Param info util: dict and string summary #288
Very cool. Here is my old internal version (I had it for TF and changed it to support TF and JAX):
https://gist.github.com/Marvin182/4c87f14b01aa1bf481d312e36e32332e
Yours looks much nicer though.
This is awesome! I have a small request. Would it be possible to factor out the logic that returns the number of parameters (something similar to the 'count_parameters' method that @Marvin182 implemented in https://gist.github.com/Marvin182/4c87f14b01aa1bf481d312e36e32332e)? This would be helpful for testing model definitions in examples. I currently have TODOs in #287 and #289.
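For reference, a minimal sketch of what such a standalone counting helper could look like. The name `count_parameters` is borrowed from the request above, but the body is an illustration and not the gist's code; it assumes parameters live in a nested dict whose leaves expose a `.shape` attribute, and it uses `SimpleNamespace` objects as stand-ins for real arrays.

```python
import math
from types import SimpleNamespace
from typing import Any, Mapping


def count_parameters(params: Mapping[str, Any]) -> int:
  """Sums the element counts of every array-like leaf in a nested dict."""
  total = 0
  for value in params.values():
    if isinstance(value, Mapping):
      total += count_parameters(value)  # recurse into sub-modules
    else:
      total += math.prod(value.shape)  # a scalar with shape () counts as 1
  return total


# Stand-ins for arrays (assumption): only the `.shape` attribute is used.
params = {
    "Dense_0": {
        "kernel": SimpleNamespace(shape=(3, 4)),
        "bias": SimpleNamespace(shape=(4,)),
    },
    "Dense_1": {
        "kernel": SimpleNamespace(shape=(4, 2)),
        "bias": SimpleNamespace(shape=(2,)),
    },
}

total = count_parameters(params)  # 12 + 4 + 8 + 2
```

In an example's test, a helper like this would let the model definition be checked with a single assertion on the expected total.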
Thanks! Avital wanted that as a HOWTO, which I have here #277. I'll consult with him :)
Codecov Report
@@            Coverage Diff             @@
##           master     #288      +/-   ##
==========================================
- Coverage   79.39%   77.63%   -1.76%
==========================================
  Files          34       34
  Lines        2252     2312      +60
==========================================
+ Hits         1788     1795       +7
- Misses        464      517      +53
…, sorting parameters by layer index, support for dynamic sizing and nested dicts
def _name_idx(name: str):
  """Returns the layer index of the parameter name."""
  index = name[name.find('_') + 1 : name.find('/')]
I think this is bad style. New layers might not follow this naming convention, and the param info shouldn't rely on it. I have actually had good experience with just sorting alphabetically by name.
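To make the concern concrete, here is a small sketch that isolates the slice expression from the diff and applies it to a few parameter names. The wrapper name `layer_index_slice` and the example names are hypothetical; only the slice itself comes from the code under review.

```python
def layer_index_slice(name: str) -> str:
  """The slice expression from the diff under review, isolated for illustration."""
  return name[name.find('_') + 1 : name.find('/')]


# Hypothetical parameter names: two auto-named layers and two manually named ones.
names = ["Dense_1/kernel", "encoder/kernel", "Dense_0/bias", "my_layer/bias"]
slices = {name: layer_index_slice(name) for name in names}

# Auto-named layers yield a numeric string ...
assert slices["Dense_1/kernel"] == "1"
# ... but a name without '_' makes find() return -1, so the slice starts at 0,
assert slices["encoder/kernel"] == "encoder"
# and a '_' inside a custom name yields an arbitrary fragment.
assert slices["my_layer/bias"] == "layer"

# Sorting by the full name makes no assumption about the naming convention.
by_name = sorted(names)
```

One caveat of plain alphabetical sorting: it is lexicographic, so a name like `Dense_10/kernel` sorts before `Dense_2/kernel`.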
@rolandgvc - I wouldn't try to sort parameters by the "layer order". In general, the "order" of layers is only a partial order, given that arbitrary dataflow can happen in complex modules. Also, this ordering information is lost for manually named layers.
@avital - the alternative to names is an opaque tree-structure-based serialization (which we used in trax) and it is a complete nightmare to work with when debugging. Pragmatically, I am happy to suffer Hyrum's law if it enables vastly more pleasant debugging. I want a human-navigable "at rest" representation of models.
I believe we can get "the best of both worlds" with easy
introspection while allowing people to look through ordered lists when
those are semantically meaningful.
In the meantime, I propose we add a strong TODO comment here saying that
we should treat this as a use case for the ongoing API rewrite considerations.
This could also be resolved by storing the params as OrderedDicts instead, so that they keep the order of the applied modules by default.
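A tiny sketch of that idea, using only the standard library. The layer names and the shape tuples standing in for arrays are made up for illustration; this is not how Flax stores parameters.

```python
from collections import OrderedDict

# Hypothetical parameter tree, filled in the order the modules were applied.
# Shape tuples stand in for the actual arrays.
params = OrderedDict()
params["stem"] = OrderedDict(kernel=(3, 3, 3, 16))
params["Dense_0"] = OrderedDict(kernel=(16, 10), bias=(10,))
params["head"] = OrderedDict(kernel=(10, 2))

application_order = list(params)     # insertion order is preserved
alphabetical_order = sorted(params)  # what sorting by name would give instead
```

This records only the order in which entries were inserted. As noted earlier in the thread, the true layer order is a partial order in general, so insertion order is one valid linearization rather than a canonical one.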
Let's just add a comment for now and I think we can merge this.
Diff context for the review comment by Anselm Levskaya (Wed, Jun 10, 2020, 7:25 AM) in flax/nn/utils.py:
+def flatten_dict(input_dict: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
+ """Flattens the keys of a nested dictionary."""
+ output_dict = {}
+ for key, value in input_dict.items():
+ nested_key = "{}/{}".format(prefix, key) if prefix else key
+ if isinstance(value, dict):
+ output_dict.update(flatten_dict(value, prefix=nested_key))
+ else:
+ output_dict[nested_key] = value
+ return output_dict
+
+
+def _name_idx(name: str):
+ """Returns the layer index of the parameter name."""
+ index = name[name.find('_') + 1 : name.find('/')]
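As an illustration of what the `flatten_dict` helper in the hunk above produces, here is the function copied from the diff (with indentation restored) and applied to a small nested dict. The example input is made up, and string leaves stand in for arrays.

```python
from typing import Any, Dict


def flatten_dict(input_dict: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
  """Flattens the keys of a nested dictionary."""
  output_dict = {}
  for key, value in input_dict.items():
    nested_key = "{}/{}".format(prefix, key) if prefix else key
    if isinstance(value, dict):
      output_dict.update(flatten_dict(value, prefix=nested_key))
    else:
      output_dict[nested_key] = value
  return output_dict


# Hypothetical nested parameter dict; the string values stand in for arrays.
nested = {
    "Dense_0": {"kernel": "k0", "bias": "b0"},
    "encoder": {"Dense_0": {"kernel": "k1"}},
}
flat = flatten_dict(nested)
# Keys become '/'-joined paths such as "encoder/Dense_0/kernel".
```

Note that an empty sub-dict contributes no keys, so a module without parameters does not appear in the flattened result.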
See CLU (Common Loop Utils) for another implementation: https://github.com/google/CommonLoopUtils/blob/master/clu/parameter_overview.py
Hey! If adding a dependency is not too much of a burden, maybe we could leverage rich's tables (rich.table); they provide a lot of formatting options, colors, and other features such as properly handling multi-line rows. Here is an example of rich in action:
@cgarciae that looks really good. But I wonder if we can/should separate the needs: Part 1 is a function that, given a Flax module (and variables and inputs?), generates some simple dict output that describes the module hierarchy. Part 2 is then a small piece of code that reads this dict and calls into rich's table rendering. We could then replace part 2 with any other renderer. WDYT? We could even put the "renderer" part in a separate pip package to remove the dependency of flax on it. It's also worth comparing and contrasting with https://dm-haiku.readthedocs.io/en/latest/notebooks/visualization.html
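A rough sketch of that two-part split, using only the standard library. The function names `describe` and `render_text` are hypothetical and not part of Flax; the sketch works on a nested dict of parameters rather than on a module, and it uses shape tuples as stand-ins for arrays.

```python
import math
from typing import Any, Dict, List, Mapping

Row = Dict[str, Any]


def describe(params: Mapping[str, Any], prefix: str = "") -> List[Row]:
  """Part 1: turns a nested parameter dict into plain, renderer-agnostic rows."""
  rows: List[Row] = []
  for key, value in params.items():
    name = f"{prefix}/{key}" if prefix else key
    if isinstance(value, Mapping):
      rows.extend(describe(value, name))
    else:
      shape = tuple(value)  # assumption: each leaf is given as its shape
      rows.append({"name": name, "shape": shape, "size": math.prod(shape)})
  return rows


def render_text(rows: List[Row]) -> str:
  """Part 2: one possible renderer; it only reads the rows from part 1."""
  width = max([len("name")] + [len(row["name"]) for row in rows])
  lines = [f"{'name':<{width}}  {'shape':<12}  size"]
  for row in rows:
    lines.append(f"{row['name']:<{width}}  {str(row['shape']):<12}  {row['size']}")
  lines.append(f"total: {sum(row['size'] for row in rows)}")
  return "\n".join(lines)


rows = describe({"Dense_0": {"kernel": (3, 4), "bias": (4,)}})
text = render_text(rows)
```

Because `render_text` depends only on the list of rows, a table-library renderer could consume the same data without `describe` knowing about it, which is the separation proposed above.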
I think splitting the functionality makes a lot of sense, the current proposed implementation in the PR more or less does this via the separation between
Later on |
Closing in favor of #1844
This is a proposal for two parameter summary utils, one that returns the information as a dict and one as a string.