Add tests for ingestion of files with missing columns #128

Open
diskontinuum opened this issue Sep 29, 2020 · 2 comments

Comments

@diskontinuum
Contributor

diskontinuum commented Sep 29, 2020

See also pycytominer/issues/79.

@diskontinuum changed the title from "Add testTest ingestion of files with missing" to "Add tests for ingestion of files with missing" on Sep 29, 2020
@diskontinuum changed the title from "Add tests for ingestion of files with missing" to "Add tests for ingestion of files with missing columns" on Sep 29, 2020
@diskontinuum
Contributor Author

Please ignore the several accidental Enter-key typos in the title above ;)

@diskontinuum
Contributor Author

The current cytominer-database version handles missing columns explicitly, but only with the --parquet option: each dataframe that will be concatenated and written to a Parquet file is first aligned with a specific reference dataframe:
`aligned_df, ref_df = df.align(ref_df, axis=1)`

Missing columns are added as NaN-valued columns by default (with the same column name and type as in the reference frame). To choose a specific fill value instead, we could pass the fill_value parameter to pandas.DataFrame.align().
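
As a starting point for the tests, here is a minimal sketch of the alignment behaviour using pandas alone. The frames and the column names (`ImageNumber`, `Area`, `Intensity`) are made up for illustration and are not taken from cytominer-database; only `DataFrame.align` and its `axis` and `fill_value` parameters are real pandas API.

```python
import pandas as pd

# Hypothetical reference frame holding the full set of expected columns.
ref_df = pd.DataFrame({"ImageNumber": [1], "Area": [10.0], "Intensity": [0.5]})

# Hypothetical ingested frame that lacks the "Intensity" column.
df = pd.DataFrame({"ImageNumber": [2, 3], "Area": [12.0, 14.0]})

# Default behaviour: the missing column is added and filled with NaN.
aligned_df, _ = df.align(ref_df, axis=1)

# With fill_value, the missing column is filled with the chosen value instead.
filled_df, _ = df.align(ref_df, axis=1, fill_value=0)

# The aligned frame now has the same set of columns as the reference frame.
assert set(aligned_df.columns) == set(ref_df.columns)
assert aligned_df["Intensity"].isna().all()
assert (filled_df["Intensity"] == 0).all()

# Columns that were already present keep their values.
assert aligned_df["Area"].tolist() == [12.0, 14.0]
```

Note that `align` uses `join="outer"` by default, so a column present in `df` but absent from `ref_df` would also be added to the returned reference frame. A test for that case might be worth adding as well.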
