Metadata state-column can't have a zero value or be non-numeric category #38

Closed
mestaki opened this issue Dec 11, 2020 · 4 comments

mestaki commented Dec 11, 2020

Hey @cameronmartino,
Finally getting around to trying out this awesome tool, huge congrats on the paper btw.

I was trying a paired dataset with just 2 timepoints when I ran into a problem with my metadata state column.
I tried 2 different columns that could represent my states:
timepoint <- numeric values of 0 and 6, which correspond to weeks, or
Est_status <- categorical values of Baseline and 6mosEst

Both represent the same states; one is numeric, the other categorical. I'm on q2-2020.6, with q2-deicode 0.2.4 and gemelli 0.0.6 installed.
When I run the command below with either of those 2 columns:

!qiime gemelli ctf \
    --i-table table.qza \
    --m-sample-metadata-file pilot_metadata2.txt \
    --m-feature-metadata-file taxonomy_silva132.qza \
    --p-state-column Est_status \
    --p-individual-id-column Patient_ID \
    --output-dir gemelli/ \
    --verbose

I get the following error:

/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
Traceback (most recent call last):
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/q2cli/commands.py", line 329, in __call__
    results = action(**arguments)
  File "<decorator-gen-555>", line 2, in ctf
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    output_types, provenance)
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/qiime2/sdk/action.py", line 390, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/gemelli/ctf.py", line 40, in ctf
    feature_metadata)
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/gemelli/ctf.py", line 129, in ctf_helper
    n_initializations=n_initializations).fit(tensor_rclr(tensor.counts))
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/gemelli/factorization.py", line 171, in fit
    self._fit()
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/gemelli/factorization.py", line 214, in _fit
    fillna=self.fillna)
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/gemelli/factorization.py", line 534, in tenals
    if rank_estimate(obs_tmp, eps_tmp) >= (min(obs_tmp.shape) - 1):
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/gemelli/optspace.py", line 462, in rank_estimate
    r_one = np.argmin(cost)
  File "<__array_function__ internals>", line 6, in argmin
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 1267, in argmin
    return _wrapfunc(a, 'argmin', axis=axis, out=out)
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "/home/mestaki/miniconda3/envs/qiime2-2020.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 47, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
ValueError: attempt to get argmin of an empty sequence

Plugin error from gemelli:

  attempt to get argmin of an empty sequence

See above for debug info.
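
For readers tracing the messages back: both the `RuntimeWarning: Mean of empty slice` and the final `ValueError` are what NumPy emits when a reduction receives a zero-length input. The snippet below is a minimal sketch using plain NumPy only. It does not reproduce gemelli's `rank_estimate`; the empty `cost` array is an assumption standing in for whatever that function computed internally.

```python
import warnings

import numpy as np

# Hypothetical stand-in for the `cost` value seen in the traceback;
# the assumption is that it ended up with zero elements.
cost = np.array([])

# Taking the mean of an empty array warns and returns nan.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    empty_mean = np.mean(cost)
warning_messages = [str(w.message) for w in caught]

# argmin of an empty array raises instead of returning an index.
try:
    np.argmin(cost)
    argmin_error = None
except ValueError as err:
    argmin_error = str(err)

print(warning_messages)
print(argmin_error)
```

If this reading is right, the failure comes from an intermediate array being empty, not from the state values themselves being zero or non-numeric.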

If I add a random 3rd value in either of these columns it runs successfully. So can gemelli not be used on data with only 2 timepoints then? It doesn't fit the cross-sectional version either, so this would be a big loss since many projects use only before/after designs.
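
Before running on a before/after design like the one described above, it can help to confirm how many distinct states the metadata holds and whether every subject has all of them. The following is a rough sketch with pandas and does not call gemelli. The sample IDs and rows are invented for illustration, `summarize_design` is a hypothetical helper, and only the column names `Patient_ID`, `timepoint`, and `Est_status` come from this issue.

```python
import io

import pandas as pd

# Hypothetical metadata mirroring the columns named in this issue.
metadata_tsv = (
    "sample-id\tPatient_ID\ttimepoint\tEst_status\n"
    "s1\tP1\t0\tBaseline\n"
    "s2\tP1\t6\t6mosEst\n"
    "s3\tP2\t0\tBaseline\n"
    "s4\tP2\t6\t6mosEst\n"
    "s5\tP3\t0\tBaseline\n"
)
metadata = pd.read_csv(io.StringIO(metadata_tsv), sep="\t", index_col="sample-id")


def summarize_design(md, state_column, individual_id_column):
    """Count distinct states and list subjects that lack at least one state."""
    states = sorted(md[state_column].unique())
    states_per_subject = md.groupby(individual_id_column)[state_column].nunique()
    incomplete = states_per_subject[states_per_subject < len(states)].index.tolist()
    return {
        "n_states": len(states),
        "states": states,
        "incomplete_subjects": incomplete,
    }


summary = summarize_design(metadata, "Est_status", "Patient_ID")
print(summary)
```

In this made-up table the summary reports two states and flags `P3` as missing one of them, which is the kind of detail worth knowing before interpreting any error from a two-state run.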

@cameronmartino
Collaborator

Thanks for reporting and the detailed issue @mestaki! This is a bug and CTF should run for two subject/state tensors. I have put in a specific issue and PR to fix this bug.

@mestaki
Author

mestaki commented Mar 11, 2021

Howdy, just wondering if there are any updates on this fix, or estimated timeline for a fix? I want to gauge if I should wait it out for a project I'm working on or try a different approach (fingers crossed for the former).

@cameronmartino
Collaborator

Hi @mestaki

Yes, there is a branch with a bug fix waiting for a PR review. You can install and use that branch by running pip install git+https://github.com/biocore/gemelli@two-time-bug.

Thanks for the reminder!

@mestaki
Author

mestaki commented Mar 11, 2021

Fantastic, thank you!
