Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Param info util: dict and string summary #288

Closed
wants to merge 6 commits into from

Conversation

rolandgvc
Copy link
Contributor

@rolandgvc rolandgvc commented May 27, 2020

This is a proposal for two parameter summary utils, one that returns the information as a dict and one as a string.

Copy link
Collaborator

@Marvin182 Marvin182 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool. Here is my old internal version (I had it for TF and changed it to support TF and JAX):
https://gist.github.com/Marvin182/4c87f14b01aa1bf481d312e36e32332e
Yours look much nicer though.

flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
@mohitreddy1996
Copy link
Contributor

This is awesome!

I have a small request. Would it be possible to refactor out logic which return the number/count of parameters (something similar to @Marvin182 's implemented method 'count_parameters' in https://gist.github.com/Marvin182/4c87f14b01aa1bf481d312e36e32332e).

This would be helpful in testing model definition in examples. I currently have TODOs in #287 and #289

@rolandgvc
Copy link
Contributor Author

This is awesome!

I have a small request. Would it be possible to refactor out logic which return the number/count of parameters (something similar to @Marvin182 's implemented method 'count_parameters' in https://gist.github.com/Marvin182/4c87f14b01aa1bf481d312e36e32332e).

This would be helpful in testing model definition in examples. I currently have TODOs in #287 and #289

Thanks! Avital wanted that as a HOWTO, which I have here #277. I'll consult with him :)

@codecov-commenter
Copy link

codecov-commenter commented May 27, 2020

Codecov Report

Merging #288 into master will decrease coverage by 1.75%.
The diff coverage is 10.52%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #288      +/-   ##
==========================================
- Coverage   79.39%   77.63%   -1.76%     
==========================================
  Files          34       34              
  Lines        2252     2312      +60     
==========================================
+ Hits         1788     1795       +7     
- Misses        464      517      +53     
Impacted Files Coverage Δ
flax/nn/utils.py 51.40% <10.52%> (-46.60%) ⬇️
flax/metrics/tensorboard.py 91.07% <0.00%> (-3.27%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3bc5289...e41e180. Read the comment docs.

flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
flax/nn/utils.py Outdated Show resolved Hide resolved
@rolandgvc rolandgvc changed the title model_summary util Param info util: dict and string summary Jun 4, 2020
@rolandgvc rolandgvc mentioned this pull request Jun 4, 2020

def _name_idx(name: str):
"""Returns the layer index of the parameter name."""
index = name[name.find('_') + 1 : name.find('/')]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @levskaya @jheek weird stuff that happens when we autoincrement layer names as strings -- "everything becomes part of the API"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is bad style. New layers might not follow this convention and the param info shouldn't rely on it. I actually have good experience with just sorting alphabetical by name.

Copy link
Collaborator

@levskaya levskaya Jun 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rolandgvc - I wouldn't try to sort parameters by the "layer order", in general the "order" of layers is a partial order given that arbitrary dataflow can happen in complex modules. Also this ordering information is lost for manually named layers.

@avital - the alternative to names is an opaque tree-structure-based serialization (which we used in trax) and it is a complete nightmare to work with when debugging. Pragmatically, I am happy to suffer Hyrum's law if it enables vastly more pleasant debugging. I want a human-navigable "at rest" representation of models.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can get "the best of both worlds" with easy
introspection while allowing people to look through ordered lists when
those are semantically meaningful.

In the meanwhile I propose we add a strong TODO comment here saying that
we should use this as a use-case for the ongoing API rewrite considerations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be resolved by having the params as OderedDicts instead so they can keep the order of the applied modules by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just add a comment for now and I think we can merge this.

@avital
Copy link
Contributor

avital commented Jun 11, 2020 via email

@avital avital added this to the Design notes and patterns milestone Dec 12, 2020
@avital avital modified the milestones: Design notes, Patterns/HOWTOs Dec 29, 2020
@avital avital added Priority: P1 - soon Response within 5 business days. Resolution within 30 days. (Assignee required) Priority: P2 - no schedule Best effort response and resolution. We have no plan to work on this at the moment. and removed Priority: P1 - soon Response within 5 business days. Resolution within 30 days. (Assignee required) labels Dec 29, 2020
@avital avital removed their assignment Feb 13, 2021
@avital
Copy link
Contributor

avital commented Feb 13, 2021

See CLU (Common Loop Utils) for another implementation: https://github.com/google/CommonLoopUtils/blob/master/clu/parameter_overview.py

@cgarciae
Copy link
Collaborator

Hey! If adding a dependency is not too much of a burden, maybe we could leverage the rich.tables, they provide a lot of format options, colors, and other stuff like properly handing multi-line rows.

Here is an example of rich in action:

rich

@avital
Copy link
Contributor

avital commented Jan 20, 2022

@cgarciae that looks really good. But I wonder if we can/should separate the needs: Part 1 is a function that give a Flax module (and variables and inputs?) generates some simple dict output that describes the module heirarchy. Then part 2 is a small piece of code that reads this dict and calls into rich.tables. Then we could replace part 2 with any other renderer. WDYT? We could even put the "renderer" part in a separate pip package to remove the dependency of flax on it.

It's also worth comparing and contrasting with https://dm-haiku.readthedocs.io/en/latest/notebooks/visualization.html

@cgarciae
Copy link
Collaborator

I think splitting the functionality makes a lot of sense, the current proposed implementation in the PR more or less does this via the separation between param_info and show_param_info. I think we could expand the current idea with the following:

  • [optional] Rename param_info to module_info, this would take in a Module, variables, sample inputs, and would use capture_intermediates to add input/output information about the layers to the dict structure.
  • Create a render_info function that could render the structure, internally it would use rich.tables.
  • show_{module,param}_info could have a render_fn that defaults to render_info, it would simply call {module,param}_info and subsequently call render_fn. Users could provide their own render_fn.

Later on render_info could be extracted if deemed generally useful.

@cgarciae cgarciae mentioned this pull request Feb 1, 2022
5 tasks
@avital
Copy link
Contributor

avital commented Feb 3, 2022

Closing in favor of #1844

@avital avital closed this Feb 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Priority: P2 - no schedule Best effort response and resolution. We have no plan to work on this at the moment.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants