layout.get_collections drops user-added columns (and 2 bonus questions) #273

Closed
toddt opened this issue Oct 25, 2018 · 1 comment · Fixed by #432

toddt commented Oct 25, 2018

As discussed on NeuroStars:
https://neurostars.org/t/replicable-scripts-bids-and-curating-data/2623

I'm trying to document and automate the runs that I'm including in first-level analyses by adding custom columns to the scans.tsv file in the BIDS subject directory.

I've added a few columns to document excluded runs, as seen in the attached scans file (renamed with a .txt extension to make GitHub happy).
sub-SAXEIB06_scans.txt

When I try to use

bvcSessList = layout.get_collections(level='session',subject=sub)
df = bvcSessList[0].to_df()
print(bvcSessList[0].variables)

one of the columns ("OtherExclusion") has been dropped from both the df and the variables list.

I'm pretty sure that's happening because the column duplicates another exclusion column ("RepeatSubjectExclusion"): both have the value False for every run, so this line drops one of them:

_data = _data.T.drop_duplicates().T

I can code around this problem in a few ways, but it seems like maybe not the ideal behavior for get_collections().
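For reference, here is a minimal pandas-only sketch of the effect described above. The table is made up for illustration (the filenames and the two-row layout are assumptions, not the contents of the attached scans file); only the two exclusion column names come from this report. It shows that transposing, calling drop_duplicates(), and transposing back removes any column whose values match an earlier column, regardless of its name.

```python
import pandas as pd

# Hypothetical stand-in for a scans.tsv table; filenames are invented.
scans = pd.DataFrame({
    "filename": ["func/run-01_bold.nii.gz", "func/run-02_bold.nii.gz"],
    "RepeatSubjectExclusion": [False, False],
    "OtherExclusion": [False, False],
})

# Same pattern as the line quoted above: after the transpose, each original
# column is a row, so drop_duplicates() removes columns with identical values.
deduped = scans.T.drop_duplicates().T

print(list(scans.columns))    # all three columns
print(list(deduped.columns))  # "OtherExclusion" is gone
```

The first of the two identical columns survives because drop_duplicates() keeps the first occurrence by default.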

Bonus questions:

  1. The scans.tsv filename is parsed into modality/run/type/subject/task correctly, and those columns show up in the dataframe, but I can't find them (or the original filename field) in the variables or entities dictionaries. I'd think that they should be available, no?

  2. get_collections(level='session') only seems to return the func modality, and omits the anat session in the scans.tsv file. Is this intended behavior?

Thanks!
Todd

tyarkoni added the bug label Oct 25, 2018

tyarkoni (Collaborator) commented

I agree that the de-duplication is not desirable behavior; I'll fix that. (Feel free to open a PR; we can try just removing that line and seeing if anything breaks.)
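If removing the line outright turns out to break something, one possible alternative is to de-duplicate on column names instead of column values. This is only a sketch under an assumption: the thread does not say what the original line was meant to guard against, and the helper name `drop_repeated_column_names` and the example frames are hypothetical. It uses only standard pandas calls.

```python
import pandas as pd


def drop_repeated_column_names(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first column for each name; column values are never compared."""
    return df.loc[:, ~df.columns.duplicated()]


# Hypothetical frames that share a "subject" column.
left = pd.DataFrame({"subject": ["01", "01"],
                     "RepeatSubjectExclusion": [False, False]})
right = pd.DataFrame({"subject": ["01", "01"],
                      "OtherExclusion": [False, False]})

merged = pd.concat([left, right], axis=1)  # "subject" now appears twice
cleaned = drop_repeated_column_names(merged)

print(list(cleaned.columns))
```

With this approach the repeated "subject" column is collapsed, while the two exclusion columns are both kept even though their values are identical.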

Re (1): Are you saying they're not in the .entities attribute of the BIDSVariable? That's odd, and sounds like a bug. They're also in .index (as a DataFrame), and you can force them to be set in .entities by calling ._index_entities on a BIDSVariable. I'm not sure why they're not getting set properly; probably the internal indexing call is being made too early or something. Will look into it.

Re (2): It's sort of intended behavior, in that I was initially focusing exclusively on getting BOLD-related data to work. At this point, grabbing other data is technically possible with small modifications. I'll open an issue, but I can't guarantee it'll work quite right without more testing than I have time for right now.

Thanks!
