
[Relay][VM] Add AllocTensor instruction and better instruction printer #3306

Merged (9 commits) on Jun 14, 2019

Conversation

@icemelon (Member) commented on Jun 7, 2019:

  1. I changed the current AllocTensor instruction to an AllocTensorReg instruction for dynamic shape allocation, and added a new AllocTensor instruction for constant shapes. Currently, every AllocTensor requires a preceding LoadConst instruction, which I think significantly increases the number of instructions and reduces readability. Therefore I think it's better to have both an AllocTensor and an AllocTensorReg instruction.
  2. I updated the VM instruction printer so that its output is easier to understand.
  3. Fixed InvokePacked support for tuple-typed inputs.
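To make point 1 concrete, here is a small Python sketch of the two encodings. The instruction types and field names are illustrative only, loosely modeled on the Relay VM's bytecode; they are not TVM's actual C++ types.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical instruction records (illustrative, not TVM's real types).
@dataclass
class LoadConst:
    dst: int
    const_index: int

@dataclass
class AllocTensorReg:
    dst: int
    shape_reg: int
    dtype: str

@dataclass
class AllocTensor:
    dst: int
    shape: Tuple[int, ...]
    dtype: str

# Before this PR: a constant shape must first be materialized into a
# register via LoadConst, then consumed by the register-based allocation.
before = [LoadConst(dst=1, const_index=0),
          AllocTensorReg(dst=2, shape_reg=1, dtype="float32")]

# After this PR: the constant shape is embedded in the instruction itself,
# halving the instruction count for this common case.
after = [AllocTensor(dst=2, shape=(1, 3, 224, 224), dtype="float32")]

print(len(before), len(after))  # 2 1
```

The sketch shows why constant-shape allocations shrink the instruction stream: every LoadConst that existed only to feed a shape register disappears.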

cc: @jroesch @wweic @zhiics @tqchen

Review comment on src/relay/backend/vm/compiler.cc (outdated):
runtime::TVMArgsSetter setter(values.data(), codes.data());
size_t arity = 0;
for (Index i = 0; i < arg_count; i++) {
  if (args[i].ptr_->tag == ObjectTag::kDatatype) {
Contributor:
When will this happen?

icemelon (Member Author):

Ops like concatenate take in a tuple of tensors, and fusion can sometimes make a fused function take a tuple as input.

@zhiics (Member) left a comment:

LGTM

@icemelon (Member Author):

@jroesch @tqchen Could you help review this PR?

@jroesch (Member) commented on Jun 13, 2019:

Sorry for the slow review, Haichen. Could you provide an example of the updated instruction printer? My only concern is that I'm not sure we should introduce duplicate instructions for readability wins; should we really differentiate static vs. dynamic allocation?

I am not 100% sure this matches the design I had in mind for memory manifestation and optimization. We could just do this for now and change it later when the memory work ships, but I was thinking we should introduce a low-level concept of storage, independent from the allocation of tensors.

For example, this function:

fn @f(%x, %y) {
  let z = %x + %y;
  ...
}

would become:

fn @f(%x, %y) {
  let sto = alloc_storage(..);
  let out1 = alloc_tensor(sto, ...);
  add(%x, %y, %out1);
  ...
}

I guess we could have two low-level instruction variants which take storage, dtype, and shape, but in general I don't like specializing/duplicating, since it increases the number of potential code paths, and the allocation mechanism should have no real differences, AFAICT. The rest of the changes look good to me.
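The storage-vs-tensor split described above can be sketched in a few lines of Python. This is a toy model only: `alloc_storage` and `alloc_tensor` are hypothetical names taken from the Relay pseudocode in the comment, not TVM's API, and the "tensors" here are just byte views.

```python
# Toy model of the proposed split: one alloc_storage reserves a backing
# buffer, and multiple alloc_tensor calls create views into it.
class Storage:
    def __init__(self, size_bytes):
        self.buffer = bytearray(size_bytes)

def alloc_storage(size_bytes):
    # Reserve a raw, untyped region of memory.
    return Storage(size_bytes)

def alloc_tensor(storage, offset, nbytes):
    # A "tensor" is just a view into the backing storage; no new
    # allocation happens here, only slicing.
    return memoryview(storage.buffer)[offset:offset + nbytes]

sto = alloc_storage(1024)
out1 = alloc_tensor(sto, 0, 512)    # two tensors share one storage
out2 = alloc_tensor(sto, 512, 512)
print(len(out1), len(out2))  # 512 512
```

The design point is that allocation policy (how much raw memory, where) is decided once at the storage level, while tensor creation becomes a cheap view operation, which is why a single allocation instruction could serve both static and dynamic shapes.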

@icemelon (Member Author):

@jroesch I don't have a strong argument for adding the new static tensor allocation, but I don't see that adding this new instruction will cause any overhead either. I suggest we keep this static alloc instruction for now and later change it to allocate tensors from storage. I can also reserve an opcode for an alloc_storage instruction. What do you think?

I'll add examples of instruction printer.

@icemelon (Member Author):

Updated instruction printer outputs:
Move: move $2 $1
Return: ret $3
Alloc tensor: alloc_tensor $1 [1, 3, 224, 224] float32
Alloc tensor with register: alloc_tensor_reg $2 $1 float32
Alloc datatype: alloc_data $3 tag(0) [$1, $2]
Alloc closure: alloc_closure $4 VMFunc[3]($1, $2)
Load const: load_const $1 Const[1]
Get field: get_field $2 $1[2]
Invoke VM function: invoke $4 VMFunc[4]($1, $2, $3)
Invoke packed function: invoke_packed PackedFunc[1](in: $1, $2, out: $3, $4)
Invoke closure: invoke_closure $4 $1($2, $3)
If: if $2 0 10
Goto: goto 2
Select: select $4 $1 $2 $3
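A printer producing lines in the style shown above can be sketched in a few lines of Python. The formatting rules here are inferred from the sample output (registers as `$n`, shapes as bracketed lists); this is an illustrative sketch, not TVM's actual printer.

```python
# Minimal sketch of an instruction printer in the style shown above.
# Integers are rendered as registers ($n), lists/tuples as shapes, and
# everything else (dtypes, function refs) is printed verbatim.
def print_instr(opcode, *operands):
    def fmt(op):
        if isinstance(op, int):
            return f"${op}"
        if isinstance(op, (list, tuple)):
            return "[" + ", ".join(str(x) for x in op) + "]"
        return str(op)
    return opcode + " " + " ".join(fmt(op) for op in operands)

print(print_instr("move", 2, 1))
# move $2 $1
print(print_instr("alloc_tensor", 1, [1, 3, 224, 224], "float32"))
# alloc_tensor $1 [1, 3, 224, 224] float32
```

Keeping the operand formatting in one helper is what makes the output uniform across opcodes, which is the readability win the PR is after.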

@jroesch (Member) commented on Jun 14, 2019:

Okay, looks good. I plan on moving back to the VM work after the tutorial.

@jroesch jroesch merged commit b8fa8f6 into apache:master Jun 14, 2019
@icemelon icemelon deleted the vm-print branch June 14, 2019 22:19
wweic pushed a commit to wweic/tvm that referenced this pull request on Jun 26, 2019 (apache#3306):
* Update vm print & add AllocTensor instruction

* patch

* fix invoke packed

* update cmake

* tweak move

* update invoke_closure

* lint

* add doc

* tweak
wweic pushed a commit to neo-ai/tvm that referenced this pull request on Jun 27, 2019 (apache#3306), with the same commit list.
5 participants