Add TF image classification example script #19956
Conversation
The documentation is not available anymore as the PR was closed or merged.
Force-pushed from 37a585f to 95c3a62
sgugger left a comment
Thanks a lot for working on this!
It looks like your branch is a bit old and does not contain some fixes made to make sure the example tests run when an example is modified (you can see the example tests are not running here 😅). Could you try a rebase on main?
Force-pushed from 95c3a62 to 270bfb0
Still no good. I scanned the test fetcher and found a potential bug. Could you add `elif f.startswith("examples/tensorflow"): test_files_to_run.append("examples/tensorflow/test_tensorflow_examples.py")` at line 562 of the test fetcher?
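For context, a rough sketch of how that suggested branch might slot into the test fetcher's per-framework dispatch. Only the `examples/tensorflow` branch comes from the comment above; the surrounding structure and names are assumptions for illustration, not the actual file contents.

```python
# Hypothetical sketch of the per-framework dispatch in the test fetcher.
# Only the "examples/tensorflow" branch is taken from the suggestion above.
modified_files = ["examples/tensorflow/image-classification/run_image_classification.py"]

test_files_to_run = []
for f in modified_files:
    if f.startswith("examples/pytorch"):
        test_files_to_run.append("examples/pytorch/test_pytorch_examples.py")
    elif f.startswith("examples/flax"):
        test_files_to_run.append("examples/flax/test_flax_examples.py")
    elif f.startswith("examples/tensorflow"):
        test_files_to_run.append("examples/tensorflow/test_tensorflow_examples.py")

print(test_files_to_run)
```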
@sgugger I've rebased from upstream main and force pushed again. The example tests still aren't running, and I can't see anything in the diff that could be overriding this 😅 I'm sure there's something I'm overlooking. I'll keep digging, but let me know if there's something else I should be doing. Sorry to bother.

@sgugger - sorry, late on the comment as well. I'll add your suggestion!
Is there a way to make the test run in less than 30-40s? It would be nice for it to run on the CI.
Definitely - I'll play with the model / dataset / number of samples.
So, this was paused for quite a while because the script was taking ages on the prepare_tf_dataset step. Thanks to the sleuthing from @Rocketknight1, the subsequent to_tf_dataset updates, and using with_format("np"), things move a lot faster. Switching to an hf-internal-testing dataset and a small model means it now runs in under 30 seconds.
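As a rough illustration of the pattern described above; the checkpoint, dataset, and column names here are placeholders for illustration, not the script's actual choices.

```python
from datasets import load_dataset
from transformers import AutoFeatureExtractor, TFAutoModelForImageClassification

# Placeholder checkpoint and dataset, chosen only to illustrate the pattern.
checkpoint = "google/vit-base-patch16-224"
dataset = load_dataset("beans", split="train")

feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
model = TFAutoModelForImageClassification.from_pretrained(checkpoint)

def preprocess(examples):
    # Convert PIL images into model inputs (pixel_values).
    images = [image.convert("RGB") for image in examples["image"]]
    examples["pixel_values"] = feature_extractor(images)["pixel_values"]
    return examples

dataset = dataset.map(preprocess, batched=True, remove_columns=["image"])

# Keeping the dataset in NumPy format lets prepare_tf_dataset hand the arrays
# to tf.data directly instead of converting each example through Python objects.
dataset = dataset.with_format("np")

tf_dataset = model.prepare_tf_dataset(dataset, batch_size=8, shuffle=True)
```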
Rocketknight1 left a comment
This looks really clean overall! Have you tested that you get similar accuracy after fine-tuning to the PT example? I know the CI test checks this, but the tolerances on quick small tests like that have to be quite wide, so they can miss subtler performance degradation.
I know this is from torchvision and not your idea, but trying 10 random rolls and then giving up rather than figuring out how to constrain your randomness so it outputs valid solutions the first time is quite funny.
Although on reflection I suppose doing it this way gives you an unbiased sample from the space of valid target_area + aspect ratio pairs, and you'd need to handle things quite carefully to make that the case if you constrained the random rolls to always be valid.
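For readers who haven't seen the torchvision behaviour being referenced, here is a minimal sketch of the try-a-few-rolls-then-fall-back sampling; the parameter names and the exact fallback are assumptions for illustration, not this PR's code.

```python
import math
import random

def random_resized_crop_params(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)):
    """Sample a crop box (top, left, h, w): up to 10 unconstrained rolls,
    then fall back to a central square crop if every roll is invalid."""
    area = height * width
    for _ in range(10):
        target_area = random.uniform(*scale) * area
        # Sampling the aspect ratio log-uniformly keeps it symmetric around 1.
        aspect_ratio = math.exp(random.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = int(round(math.sqrt(target_area * aspect_ratio)))
        h = int(round(math.sqrt(target_area / aspect_ratio)))
        if 0 < w <= width and 0 < h <= height:
            top = random.randint(0, height - h)
            left = random.randint(0, width - w)
            return top, left, h, w
    # Fallback: a valid central square crop (a simplification of the real fallback).
    side = min(width, height)
    return (height - side) // 2, (width - side) // 2, side, side
```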
I haven't looked at our image models too much, but I'm guessing that the scaling factors feature_extractor.image_mean and feature_extractor.image_std were also computed after division by 255? It seems odd to normalize the scale twice like this.
Exactly. It's pretty standard to have the values this way in vision - most of our image_mean and image_std values are constants from e.g. ImageNet. I think it comes down to being able to use floats - if I found that the mean of my R pixels was 134.7, how would I subtract it from the pixel values while they're still uint8?
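A small sketch of that point, using the usual ImageNet constants purely as an example:

```python
import numpy as np

# Standard ImageNet normalization constants, already expressed on the [0, 1] scale.
image_mean = np.array([0.485, 0.456, 0.406])
image_std = np.array([0.229, 0.224, 0.225])

def preprocess(image_uint8: np.ndarray) -> np.ndarray:
    """Rescale uint8 pixels to floats first, then normalize per channel."""
    image = image_uint8.astype(np.float32) / 255.0  # rescale to [0, 1]
    return (image - image_mean) / image_std         # fractional means only work in float
```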
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hey @amyeroberts, is this PR still going ahead? It looked almost ready!
@Rocketknight1 Yes - sorry, this fell down my priority list for a bit. The code is all ready to go - I was trying to find models that make the tests run quickly, cf. this comment.
@amyeroberts Ah, that makes sense! It's totally okay to upload your own super-mini model and use that - it doesn't really matter if the accuracy is bad; the test will just let us detect if the outputs from this model class change suddenly.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Force-pushed from 0dea54f to 8ef50e3
What does this PR do?
Adds the TF equivalent of the PyTorch image classification example script.