Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "operator" style to Model Library Format #8072

Merged
merged 17 commits into from
Jul 2, 2021

Conversation

areusch
Copy link
Contributor

@areusch areusch commented May 18, 2021

This PR adds Model Library Format support to artifacts created by tvm.build. It introduces a style key to the Model Library Format metadata with two initial values:

  • full-model
  • operator

Implementations that use Model Library Format will now need to check style before reading other files in the archive. full-model indicates the previously-used format.

operator is introduced to allow exporting TVM libraries that contain only operator functions with no model-level information (e.g. executor configuration, model-wide memory planning, etc). The goal of operator style is to allow exporting fragments of models (e.g. individual TVM operators) for use with the TVM RPC Server. After the Project API refactor lands, TVM auto-tuning will produce MLF in operator style, and those MLF archives will be given to project generators with the ultimate goal of flashing and timing those operators on-device.

MLF archives with operator style contain:

  • codegen directory, organized as in full-model
  • metadata.json of the same format as full-model with different values.
  • src/tir-<device_type>.txt, containing pretty-printed TIR

Notably, the memory key in metadata.json contains shape information for each operator function parameter. The shape information has names correlated with those used in the TIR sources in src/tir-*.txt.

@leandron @manupa-arm @giuseros @Mousius @gromero @mehrdadh @stoa

@areusch areusch marked this pull request as ready for review May 20, 2021 17:04
@areusch
Copy link
Contributor Author

areusch commented May 25, 2021

friendly ping! this one is blocking Project API work

Copy link
Contributor

@giuseros giuseros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @areusch , sorry for the latency. As I said my github notifications stopped working for a while :)



def _export_graph_model_library_format(
mod: executor_factory.GraphExecutorFactoryModule, tempdir: pathlib.Path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be ExecutorFactoryModule to be compatible with AOT as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 53 to 55
if (text_printer_.GetVarName(args[0], &var_name)) {
*rv = var_name;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this ICHECK if GetVarName returns false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess the thinking is that the caller can decide what to do. since the function will return None, it should be fairly straightforward. we could raise an exception. are you thinking that users may not check the return value?

Comment on lines +308 to +309
with open(src_dir / f"tir-{target_device_type}.txt", "w") as f:
f.write(printer["print"](ir_mod))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not following why adding the TIR in the archive. Is this for test purposes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's mostly an analogy to adding the relay.txt into the archive--to provide TVM source code for the generated code. though I see your point that TIR is quite close to the generated code.

Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay!

I think we might want to add to title/description the PR that this is essentially adding MLF as an output format for tvm.build.

Most of comments are related to code locations and usage of runtime.Module over Object/Node. See if you agree.

namespace tvm {
namespace printer {

class ModelLibraryFormatPrinter : public ::tvm::runtime::ModuleNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we extending runtime.Module ?
Any reason why we cant use Objects and Nodes to expose this to python?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need GetFunction to expose an object to Python with member functions, and i think the member function style provides a bit more structure to the API than just e.g. placing it all in a tuple. I don't think we can use Node since it doesn't provide the sptr_to_self

Copy link
Contributor

@manupak manupak Jun 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually referring to other objects with member functions.
E.g. :
https://github.com/manupa-arm/incubator-tvm/blob/master/src/relay/analysis/call_graph.cc
https://github.com/manupa-arm/incubator-tvm/blob/master/python/tvm/relay/analysis/call_graph.py

There are similiar structure in AutoScheduler as well. I always thought this was better than extending runtime.Modules and using packed functions. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. i think that this way is a more codified dispatch mechanism (e.g. string table lookup) than that used by Module (GetFunction typically implemented with a series of if statements and closures), but it requires additional duplication of each member function in e.g. Python. I think a proper interface-oriented FFI would just wrap all of this stuff automatically.

my preference with interface FFI functions in TVM is to use Module in general, but add some automation to the build to auto-generate GetFunction and avoid closures in that function. It is a bit weird, I'll admit, since it re-uses a mechanism meant for the runtime. But, I think it's the only generic member-function lookup we have right now, and there are some non-runtime use cases.

Copy link
Contributor

@manupak manupak Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think there is precedence for both approaches. Lets not block this based on this :).
Maybe its better to converge to a single policy when exposing member functions across the FFI.
cc : @tqchen @jroesch .

Though, if the intention is of this class to be used in python, I'd still prefer to have a documented interface in python -- it seems much easier to follow than the indirections via GetFunction.

_populate_codegen_dir(mod, codegen_dir)


ExportableModule = typing.Union[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether the model_library_format.py is the right place to hold this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess this is a bit specific to Model Library Format--you can build shared libraries from things we don't know how to export into MLF. happy to change the name, or we can revisit this when we promote MLF to a top-level TVM export format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, sounds good for now then.

memory_map = {}
for target_device_type, target in targets.items():
ir_mod = ir_module_by_target[target]
printer = get_global_func("tir.ModelLibraryFormatPrinter")(False, None, False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we can have this neatly hidden under _ffi_api.py and move the c++ implementations related to ModelLibraryFormatPrinter to a matching model_library_format.cc.

Why do we think ModelLibraryFormatPrinter belongs to the namespace of tir?

Copy link
Contributor Author

@areusch areusch Jun 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah good point. in src/printer, we have a few entry points:

  • src/printer/tvmscript_printer.cc defines script.AsTVMScript
  • src/printer/text_printer.cc defines ir.PrettyPrint and ir.AsText

so i guess the folder doesn't provide any namespace grouping right now, even though printer implementations are consolidated there. i'm okay moving to micro.ModelLibraryFormatPrinter or ir.ModelLibraryFormatPrinter, if that's what you're suggesting. tir seemed like a fit since that's how we are using it now, though it should work with any IRModule.

could you let me know which namespace you're suggesting to move to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of "micro.model_library_format.printer" being the registration and make printer a python function that binds to C++ under _ffi_api.py (similiar to how its done in CallGraph).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case, we need the member function to retrieve the mapping--this is why i used Module. as for the namespace, i don't have a strong opinion, but the only micro directory we have in src is src/runtime, and this is clearly not a runtime component. so we'd need to create src/micro, is all. i'm not opposed to that, but was following convention for Printer in keeping ModelLibraryFormatPrinter underneath src/printer, is all.

TextPrinter text_printer_;
};

TVM_REGISTER_GLOBAL("tir.ModelLibraryFormatPrinter")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above about the source code arrangement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine as it works for all TIR -- I just got reminded that we moved MLF to micro for a different reason.

@areusch areusch force-pushed the model-library-format-tvm-build branch from 2957cde to 7b1ef1a Compare June 21, 2021 15:03
@areusch
Copy link
Contributor Author

areusch commented Jun 21, 2021

@giuseros @manupa-arm please take a look, i think I've addressed your comments or replied on thread

Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack -- we could use this namespace as it works for all TIR.
(It was a bit of puzzle to me why we moved MLF to micro in the first place :) )

I still think having a python class for a python exposed interface is more readable (though is more overhead in-terms of code -- yes it'd be nice if we could have a generation mechanism).

However, we should have a policy as this is a common occurrence in the codebase -- both approaches. @tqchen @jroesch.

return doc.str();
}

PackedFunc GetFunction(const std::string& name, const ObjectPtr<Object>& sptr_to_self) override {
Copy link
Contributor

@manupak manupak Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going with this approach, I feel we should limit this to just to the lookup ladder of functions.
i.e., its better to implement the lambda functions as separate functions

Moreover, since this is main interface to runtime.Module, I think we should provide documentation of the functions and arguments -- maybe once the functions are seperated out, it could be the documentation of those functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with that--however, we do need the lambda function to capture sptr_to_self (this mimics the Python descriptor get() implementation). i moved the body into a separate function to align this class for a future world where we implemented the auto-generated interface.

cc @jroesch who has a prototype of the auto-generator

@areusch
Copy link
Contributor Author

areusch commented Jun 29, 2021

@manupa-arm please let me know if there's anything else--i believe your comments are all forward-looking, but want to understand if there are specific changes needed here to merge.

@areusch
Copy link
Contributor Author

areusch commented Jun 29, 2021

@giuseros please take another look and explicitly approve if you're ok with this

Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes LGTM!.

ps : that generator would be handy :) @jroesch .

@areusch
Copy link
Contributor Author

areusch commented Jun 30, 2021

oops, think someone already took MLF version 3. added a patch to rev to v4.

@giuseros please take a look and explicitly approve if you're ok with this PR!

Copy link
Contributor

@leandron leandron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @areusch!

@leandron leandron merged commit 970aeff into apache:main Jul 2, 2021
@leandron
Copy link
Contributor

leandron commented Jul 2, 2021

Merged now, thanks @manupa-arm @giuseros @areusch!

ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
* rename _update_target and document its function

* make tvm.build return OperatorModule to return multiple outputs

* allow retrieving the var names used in TIR repr

* add Operator Model Library Format and test

* Add pathlib convenience functions to utils.TempDirectory.

* fix tests

* black format

* git-clang-format

* pylint fixes

* add asf header

* change memory map to make more sense, fix tests

* address giuseros comments

* align GetVarName with future TypedPackedFunc

* fix test

* clang-format

* rev model library format to v4 (bad merge)
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022
* rename _update_target and document its function

* make tvm.build return OperatorModule to return multiple outputs

* allow retrieving the var names used in TIR repr

* add Operator Model Library Format and test

* Add pathlib convenience functions to utils.TempDirectory.

* fix tests

* black format

* git-clang-format

* pylint fixes

* add asf header

* change memory map to make more sense, fix tests

* address giuseros comments

* align GetVarName with future TypedPackedFunc

* fix test

* clang-format

* rev model library format to v4 (bad merge)
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants