This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

ImageClassificationData.from_data_frame is not working when predict_data_frame is provided #1086

Closed
daMichaelB opened this issue Dec 23, 2021 · 3 comments · Fixed by #1088
Labels
bug / fix Something isn't working help wanted Extra attention is needed

Comments

@daMichaelB
Contributor

🐛 Bug

When I create an ImageClassificationData object with the from_data_frame factory method, it works as long as I do not set

  • predict_images_root
  • predict_data_frame

Once I set the arguments above, I get a ValueError:

ValueError: File ID `path/to/file.png` did not resolve to an existing file. For use cases which involve first converting the ID to a file you should pass a custom resolver when loading the data.

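The first thing I found is that predict_images_root is ignored, so the image cannot be found.
Digging a little deeper, it seems that from_data_frame does not pass the predict arguments to ImageClassificationDataFrameInput correctly. Unlike the other tuples, the predict tuple has no entry for target_fields, so every value after input_field is shifted by one position: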

  train_data = (train_data_frame, input_field, target_fields, train_images_root, train_resolver)
  val_data = (val_data_frame, input_field, target_fields, val_images_root, val_resolver)
  test_data = (test_data_frame, input_field, target_fields, test_images_root, test_resolver)
  predict_data = (predict_data_frame, input_field, predict_images_root, predict_resolver)

Suggestion for fix

The last line should be replaced with

predict_data = (predict_data_frame, input_field, None, predict_images_root, predict_resolver)

This way target_fields is set to None and the remaining values end up in the correct positions.
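The positional shift described above can be reproduced without Flash installed. The following is a minimal sketch, not Flash's actual code: load_data is a hypothetical stand-in, its parameter names are assumptions, and its parameter order is inferred from the train/val/test tuples quoted in this issue. It also assumes the tuples are unpacked positionally.

```python
from typing import Any, Callable, Optional, Sequence, Union


def load_data(
    data_frame: Any,
    input_key: str,
    target_keys: Optional[Union[str, Sequence[str]]] = None,
    root: Optional[str] = None,
    resolver: Optional[Callable[[Optional[str], str], str]] = None,
) -> dict:
    """Hypothetical stand-in for the data frame input.

    It only records which value each parameter received.
    """
    return {
        "input_key": input_key,
        "target_keys": target_keys,
        "root": root,
        "resolver": resolver,
    }


predict_data_frame = object()  # placeholder; a real DataFrame is not needed here
predict_images_root = "/my/root/folder/"
predict_resolver = None

# Tuple as reported in the issue: four items, no slot for the targets.
buggy = load_data(*(predict_data_frame, "file", predict_images_root, predict_resolver))

# Tuple with the suggested fix: None fills the target slot.
fixed = load_data(*(predict_data_frame, "file", None, predict_images_root, predict_resolver))

print("buggy:", buggy)
print("fixed:", fixed)
```

In the buggy call the images root lands in the target slot and the root stays None, which matches the observation that predict_images_root is ignored. With the extra None, the root reaches the parameter it was meant for.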

To Reproduce

Steps to reproduce the behavior:

Use the from_data_frame factory with prediction data, for example:

        from flash.image import ImageClassificationData

        # df is a pandas DataFrame with "file", "label", and "split" columns
        datamodule = ImageClassificationData.from_data_frame(
            "file", "label",
            train_images_root="/my/root/folder/",
            val_images_root="/my/root/folder/",
            test_images_root="/my/root/folder/",
            train_data_frame=df[df.split == "train"],
            val_data_frame=df[df.split == "valid"],
            test_data_frame=df[df.split == "test"],
            predict_images_root="/my/root/folder/",
            predict_data_frame=df,
        )

Expected behavior

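Creating the data module with predict_data_frame and predict_images_root should work without raising the ValueError. See "Suggestion for fix".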

Environment

  • PyTorch Version (e.g., 1.0): 1.10.0
  • Flash Version: 0.6.0
  • OS (e.g., Linux): Ubuntu on Docker
  • How you installed PyTorch (conda, pip, source): pip with virtualenv
  • Build command you used (if compiling from source):
  • Python version: 3.8.12
  • CUDA/cuDNN version: 11.4
  • GPU models and configuration:
  • Any other relevant information:
@daMichaelB daMichaelB added bug / fix Something isn't working help wanted Extra attention is needed labels Dec 23, 2021
@ethanwharris
Collaborator

Hey @daMichaelB thanks for reporting this! Your suggested fix looks good to me 😃 Would you be interested in opening a PR with the fix?

@daMichaelB
Contributor Author

Hello @ethanwharris, yes, I would love to do that! Give me a little time and I will also add a small test for it, if required.

@ethanwharris
Collaborator

Awesome, a test would be ideal! No rush 😃 It looks like we could do with a data frame test in here: https://github.com/PyTorchLightning/lightning-flash/blob/master/tests/image/classification/test_data.py
