Registry refactor #1410
|
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. |
e798e7f to 24f32a5
|
CLAs look good, thanks! |
|
Same tests failing on master (and as far as I can tell, unrelated to anything touched here). |
|
Excellent, thank you! At first glance, looks good! I’ll review later today. |
|
One possible change would be to remove [...]. This would make the new API more consistent (always return a function), though I'm not much of an API designer...

If we accept registries will only ever register callables, we could also put in some more specific error checks at the top level (or perhaps a subclass).

I'm also thinking putting all registries in a single object for namespacing might be nice, similar to [...].

I'll be at a proper desk in a few hours - happy to make changes myself then/take into account further comments. |
Looks great! Thanks for doing this! Just a few small changes and I think we're good to go.
tensor2tensor/utils/registry.py
Outdated
    def create_registry(registry_name):
      """Create a generic object registry.
      This is the naming function by default for registers expecting classes or
s/registers/registries/g
and in other places where you refer to register (as a noun) it should be registry
tensor2tensor/utils/registry.py
Outdated
    class Registry(object):
      """Dict-like class for managing registrations."""
      def __init__(
          self, register_name, default_key_fn=default_name, validator=None,
s/register_name/name/g
tensor2tensor/utils/registry.py
Outdated
          value_transformer (optional): if run, `__getitem__` will return
            value_transformer(key, registered_value).
        """
        self._register = {}
self._registry
tensor2tensor/utils/registry.py
Outdated
        if callback is not None:
          callback(key, value)

      def register(self, key=None):
add method docstrings here and in any other method that's > a few lines.
at least a 1-liner.
if > 1 line docstring, make sure it has an Args: section and a Returns: section
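As an illustration of the docstring convention requested above, here is a minimal sketch. The `Registry` class is a simplified stand-in written for this example, not the code under review: short methods get a one-line docstring, and the longer `register` docstring carries `Args:` and `Returns:` sections.

```python
class Registry(object):
  """Minimal dict-backed registry, used only to illustrate docstring style."""

  def __init__(self, name):
    """Creates an empty registry called `name`."""
    self._name = name
    self._registry = {}

  def register(self, key=None):
    """Returns a decorator that registers a callable under `key`.

    Args:
      key: Optional string. If None, the callable's `__name__` is used
        (an assumption made for this sketch).

    Returns:
      A decorator that stores its argument in the registry and returns it
      unchanged.
    """
    def decorator(value):
      # Fall back to the object's own name when no key was given.
      self._registry[key if key is not None else value.__name__] = value
      return value
    return decorator

  def __getitem__(self, key):
    """Looks up the value registered under `key`."""
    return self._registry[key]
```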
tensor2tensor/utils/registry.py
Outdated
          "Available optimizers:\n %s"
          % (name, "\n".join(list_optimizers())))
      return _OPTIMIZERS[name]
      def _get(self, key):
rm method. can use __getitem__ in the same way (e.g. model = model_registry.__getitem__)
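A minimal sketch of that suggestion, assuming a stripped-down `Registry` written for this example (the real class has more parameters): the bound `__getitem__` method already behaves like a lookup function, so a separate `_get` wrapper adds nothing.

```python
class Registry(object):
  """Simplified stand-in for the registry class discussed in this review."""

  def __init__(self, name):
    self._name = name
    self._registry = {}

  def __setitem__(self, key, value):
    self._registry[key] = value

  def __getitem__(self, key):
    if key not in self._registry:
      raise KeyError(
          "%s never registered with registry %s" % (key, self._name))
    return self._registry[key]


model_registry = Registry("models")
model_registry["transformer"] = lambda: "a transformer model"

# The bound method serves as the old-style lookup function; no `_get` needed.
model = model_registry.__getitem__
```

Calling `model("transformer")` is then equivalent to `model_registry["transformer"]`.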
      if prefix:
        return [name for name in _HPARAMS if name.startswith(prefix)]
      return list(_HPARAMS)
      def get(self, key, d=None):
s/d/default
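For reference, a small sketch of what the renamed signature could look like. The class body is an assumption made for illustration; only the `get(self, key, default=None)` signature follows the comment above, and it mirrors `dict.get`.

```python
class Registry(object):
  """Simplified stand-in; only the pieces needed to show `get`."""

  def __init__(self, name):
    self._name = name
    self._registry = {}

  def __setitem__(self, key, value):
    self._registry[key] = value

  def __getitem__(self, key):
    return self._registry[key]

  def __contains__(self, key):
    return key in self._registry

  def get(self, key, default=None):
    """Returns the value for `key`, or `default` if `key` is not registered."""
    return self[key] if key in self else default


hparams_registry = Registry("hparams")
hparams_registry["tiny"] = {"hidden_size": 32}
```

Spelling the argument out as `default` also makes keyword calls such as `hparams_registry.get("big", default={})` readable at the call site.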
        was_reversed: A boolean.
        was_copy: A boolean.
      """
      # Recursively strip tags until we reach a base name.
can you update the var names in this method to be consistent? e.g. was_rev vs was_reversed
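A hedged sketch of what consistent naming could look like in a recursive tag-stripping parser. The function name `parse_problem_name`, the `ProblemSpec` tuple and the `_rev` / `_copy` suffixes are assumptions made for this example; only the `was_reversed` and `was_copy` names come from the hunk above.

```python
import collections

# Hypothetical result type for this sketch.
ProblemSpec = collections.namedtuple(
    "ProblemSpec", ["base_name", "was_reversed", "was_copy"])


def parse_problem_name(name):
  """Recursively strips assumed `_rev` / `_copy` tags down to a base name.

  Args:
    name: A string, e.g. "translate_toy_rev".

  Returns:
    A ProblemSpec(base_name, was_reversed, was_copy).
  """
  if name.endswith("_rev"):
    base_name, was_reversed, was_copy = parse_problem_name(name[:-len("_rev")])
    if was_reversed:
      raise ValueError("'_rev' tag given more than once in %s" % name)
    return ProblemSpec(base_name, True, was_copy)
  if name.endswith("_copy"):
    base_name, was_reversed, was_copy = parse_problem_name(name[:-len("_copy")])
    if was_copy:
      raise ValueError("'_copy' tag given more than once in %s" % name)
    return ProblemSpec(base_name, was_reversed, True)
  # No tag left: this is the base name.
  return ProblemSpec(name, False, False)
```

The same two names, `was_reversed` and `was_copy`, are used in the docstring, the unpacking and the return values, which is the consistency the comment asks for.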
tensor2tensor/utils/registry.py
Outdated
    model_registry = Registry("models", on_set=_on_model_set)
    optimizer_registry = Registry(
        "optimizers",
        default_key_fn=lambda fn: misc_utils.snakecase_to_camelcase(fn.__name__),
let's actually rm this. have snakecase as the standard, no special case for optimizers.
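To make the "snake case as the standard" idea concrete, here is a sketch of a single default key function that every registry could share. The helper `camelcase_to_snakecase` and the example names are written for this illustration and are assumptions; the actual `default_name` in `registry.py` may be implemented differently.

```python
import re


def camelcase_to_snakecase(name):
  """Converts e.g. "MultistepAdam" to "multistep_adam" (illustrative helper)."""
  # Insert "_" before a capital that starts a lowercase word, then before
  # any capital that directly follows a lowercase letter or digit.
  partial = re.sub(r"(.)([A-Z][a-z0-9]+)", r"\1_\2", name)
  return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def default_name(class_or_fn):
  """Default registry key: the snake_case form of the object's `__name__`."""
  return camelcase_to_snakecase(class_or_fn.__name__)


class MultistepAdam(object):  # hypothetical optimizer class
  pass


def adafactor():  # hypothetical optimizer factory
  pass
```

With one default for all registries, a class and a function both end up under snake_case keys and no registry needs its own `default_key_fn`.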
tensor2tensor/utils/registry.py
Outdated
    # consistent version of old API
    model = model_registry._get
    list_models = lambda: sorted(model_registry)
let's rm sorted here and instead have __iter__ return sorted(self._registry). these can be lambda: list(model_registry)
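A short sketch of that change, using a simplified `Registry` written for this example. One detail worth noting: Python requires `__iter__` to return an iterator, so the sorted list is wrapped in `iter(...)` rather than returned directly.

```python
class Registry(object):
  """Simplified stand-in that iterates over its keys in sorted order."""

  def __init__(self, name):
    self._name = name
    self._registry = {}

  def __setitem__(self, key, value):
    self._registry[key] = value

  def __iter__(self):
    # `sorted` returns a list; `iter` turns it into the iterator Python expects.
    return iter(sorted(self._registry))

  def __len__(self):
    return len(self._registry)


model_registry = Registry("models")
for model_name in ("transformer", "lstm", "resnet"):
  model_registry[model_name] = object()

# The old-style listing function no longer needs its own `sorted` call.
list_models = lambda: list(model_registry)
```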
tensor2tensor/utils/registry.py
Outdated
    register_problem = register_base_problem


    def problem(problem_name, base_registry=base_problem_registry):
rm base_registry argument. can just use base_problem_registry in the fn body
|
I think the value transformer is nice, so let's keep that for now. |
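For readers following along, a minimal sketch of the value transformer behaviour described in the docstring fragment earlier in the review (`__getitem__` returns `value_transformer(key, registered_value)`). The class body and the hparams example are assumptions made for illustration.

```python
class Registry(object):
  """Simplified stand-in showing only the `value_transformer` hook."""

  def __init__(self, name, value_transformer=None):
    self._name = name
    self._registry = {}
    # Assumed default: hand back the registered value untouched.
    self._value_transformer = value_transformer or (lambda key, value: value)

  def __setitem__(self, key, value):
    self._registry[key] = value

  def __getitem__(self, key):
    return self._value_transformer(key, self._registry[key])


# Hypothetical use: register functions, but return their result on lookup.
hparams_registry = Registry("hparams", value_transformer=lambda key, fn: fn())
hparams_registry["tiny"] = lambda: {"hidden_size": 32}

# Without a transformer, lookup returns exactly what was registered.
plain_registry = Registry("plain")
plain_registry["tiny"] = dict
```

This is also why dropping the transformer would make lookups uniform: every registry would then return the registered callable itself.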
tensor2tensor/utils/registry.py
Outdated
      return decorator(rhp_fn, registration_name=default_name(rhp_fn))
    def _nargs_validator(nargs, message):
      def f(key, value):
        args, varargs, keywords, _ = inspect.getargspec(value)
use inspect.getfullargspec instead. looks like getargspec is deprecated
|
All changes implemented except [...].

Seems a bit weird to only expose an iterator to the sorted keys, rather than the sorted keys themselves. I suppose we could make [...].

I've updated optimizer naming convention to mostly snake case - the only exception being [...].

Thanks for the feedback. For future reference: is this the standard workflow for pull requests? Throw something up, get feedback, make tweaks? I'm very much a researcher used to working with very small groups... usually 1 (i.e. just me), or with someone I can shout to down the corridor... and while this hasn't been a painful experience by any stretch, if there's a nicer workflow then I'm only not using it out of ignorance rather than being "set in my ways". |
|
Fine with me. Thanks for these changes! Yes, this is the standard workflow. Make a pull request, go through 1+ rounds of code review, merge. For things that are larger and/or require more discussion, we tend to write it up and/or talk synchronously (in-person, chat, or video chat). It's definitely different than small groups working on small codebases where everybody might be sitting right next to each other and everybody might have direct push access to the repo, but we've found it to be a good set of tradeoffs across scalability/friction/safety (every commit at Google goes through the same code review process). |
Actually, sorry, looks like there was some usage of create_registry in the meantime (see tests). I think the line in mtf_transformer2.py can be replaced with:
layers_registry = registry.Registry("layers")
|
I've moved [...]. I can see the argument for defining registries closer to where they are predominantly used (as in the [...] |
|
Thanks @jackd! |
|
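... did something weird happen with the commits here? I see Merge branch 'master' into register_refactor followed by merged commit xxx into tensorflow:master... and then there's this which has me thoroughly confused (I mean, it looks great, but not sure it has anything to do with me or registry changes). registry.py on tensorflow/tensor2tensor fork doesn't have registry changes either... Sorry for the confusion, still getting my head around git... |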
|
No, not your fault at all. This is some weirdness with our system because
we’re syncing code back and forth between GitHub and Google’s internal
version control system. Another commit will come out tomorrow with your
change applied. Will update here when it’s out. Sorry about that.
…On Mon, Jan 28, 2019 at 6:57 PM Dominic Jack ***@***.***> wrote:
... did something weird happen with the commits here? I see Merge branch
'master' into register_refactor followed by merged commit xxx into
tensorflow:master... and then there's this
<jackd@2aa437a...8a18032>
which has me thoroughly confused (I mean, it looks great, but not sure it
has anything to do with me or registry changes). registry.py on
tensorflow/tensor2tensor
<https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/registry.py>
fork doesn't have registry changes either...
Sorry for the confusion, still getting my head around git...
|
PiperOrigin-RevId: 231486407
|
Now actually permanently merged with this commit. |
* registry refactor and deprecated call-site updates
* added on_problem_set callback, simplified name
* changed optimizer registration names to snake_case, documentation
* removed create_registry
PiperOrigin-RevId: 231486407
Re-write of `utils.registry` to a dict-like interface as per discussion here.

Common code pulled into dict-like `Registry` class. Old interface remains (`registry.register_problem`, `registry.hparams` etc), though this does mean there is a fair bit of aliasing (e.g. `register_problem = problem_registry.register`).

Removed registry registry (i.e. the registry containing registries) - I don't see why it was necessary. The point of registries (as far as I can tell) is to allow `tensor2tensor` to play nicely with external code through `t2t_usr_dir`. If external code wants to add a registry that `tensor2tensor` knows nothing about, then `tensor2tensor` won't be calling it, so it doesn't need to be registered... unless I'm missing something.

Marked a couple of function names deprecated, since they are inconsistent with naming convention of other functions in the old registry version. All aliased functions could potentially be deprecated, though I don't see much point.
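To give a rough picture of the design described above, here is a simplified sketch of a dict-like registry with the old function-style interface kept as aliases. It is an approximation written for illustration, not the merged implementation: the constructor arguments follow names that appear in the review (`default_key_fn`, `validator`, `on_set`), while the method bodies, the error message and the example problem classes are assumptions.

```python
def default_name(class_or_fn):
  """Assumed default key for this sketch: the object's `__name__`."""
  return class_or_fn.__name__


class Registry(object):
  """Dict-like registry; a simplified stand-in for the class in this PR."""

  def __init__(self, name, default_key_fn=default_name, validator=None,
               on_set=None):
    self._name = name
    self._registry = {}
    self._default_key_fn = default_key_fn
    self._validator = validator
    self._on_set = on_set

  def __setitem__(self, key, value):
    if key is None:
      key = self._default_key_fn(value)
    if key in self._registry:
      raise KeyError(
          "key %s already registered in registry %s" % (key, self._name))
    if self._validator is not None:
      self._validator(key, value)
    self._registry[key] = value
    if self._on_set is not None:
      self._on_set(key, value)

  def __getitem__(self, key):
    return self._registry[key]

  def __contains__(self, key):
    return key in self._registry

  def __iter__(self):
    return iter(sorted(self._registry))

  def __len__(self):
    return len(self._registry)

  def register(self, key=None):
    """Returns a decorator that registers its argument under `key`."""
    def decorator(value):
      self[key] = value
      return value
    return decorator


problem_registry = Registry("problems")

# Old-style module-level functions become thin aliases of registry methods.
register_problem = problem_registry.register
list_problems = lambda: list(problem_registry)


@register_problem()
class TranslateToy(object):  # hypothetical problem
  pass


@register_problem("custom_name")
class AnotherProblem(object):  # hypothetical problem
  pass
```

External code loaded through `t2t_usr_dir` would use the same decorators, which is why a separate registry of registries is not needed for this pattern to work.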