-
Notifications
You must be signed in to change notification settings - Fork 957
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[b/331681978] Add Short URL matching to KaggleFileResolver #1375
Conversation
self.assertEqual([1, 1], layer(test_inputs).shape) | ||
# Delete the files that were created in KaggleJwtHandler's do_POST method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was needed or else os.makedirs
and os.symlink
would fail when the new test tried to recreate the same files. (Alternative solution was to use a different model URL for the new test, but I thought it was cleaner to do the proper cleanup after each test.)
@@ -4,18 +4,19 @@ | |||
|
|||
from tensorflow_hub import resolver | |||
|
|||
url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$") | |||
short_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/(?P<framework>[^\\/]+)/(?P<variation>[^\\/]+)/(?P<version>[0-9]+)$") | |||
long_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering if we should load these via an ENV var that we can mutate, but I suppose it is really rare that we mutate this, and it might be fine if only new images can support changes like this going forward.
@@ -4,18 +4,19 @@ | |||
|
|||
from tensorflow_hub import resolver | |||
|
|||
url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$") | |||
short_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/(?P<framework>[^\\/]+)/(?P<variation>[^\\/]+)/(?P<version>[0-9]+)$") | |||
long_url_pattern = re.compile(r"https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)/frameworks/(?P<framework>[^\\/]+)/variations/(?P<variation>[^\\/]+)/versions/(?P<version>[0-9]+)$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a unified version of the two regex's if we wanted to still only need one:
https?://([a-z]+\.)?kaggle.com/models/(?P<owner>[^\\/]+)/(?P<model>[^\\/]+)(/frameworks)?/(?P<framework>[^\\/]+)(/variations)?/(?P<variation>[^\\/]+)(/versions)?/(?P<version>[0-9]+)$
https://regex101.com/r/JbWJNI/1
All I did was making the "extra" parts of the path optional, though I suppose this does make each part independently optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind if we keep short_url_pattern
and long_url_pattern
? I lean towards that because it feels more readable and avoids the "independently optional parts" issue you mention
This change is required to unblock https://github.com/Kaggle/kaggleazure/pull/28591, which updates the Kernel MT to only use short URLs.
Testing
Added unit test and ran ./test