-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you write a simple README for this file? How to run this test? How can users play with this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the following command to try it. Please remember to change ctx
passed to the function score
into mx.cpu()
if you are running on a cpu. It's just a preliminary end-to-end run-through for proof of concept. More changes are coming.
python --model=imagenet1k-resnet-152 --dataset=data --num-inference-batches=10
sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch) | ||
op_names = ['BatchNorm', 'Convolution', 'Pooling', 'Activation'] | ||
out = SymbolHandle() | ||
check_call(_LIB.MXPartitionGraph(sym.handle, mx_uint(len(op_names)), c_str_array(op_names), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about wrapping this function into Symbol class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will hide this from users. The API design is still under discussion.
kCrossDeviceCopy, | ||
/*! | ||
* A subgraph execution should happen in the main thread, instead of | ||
* in the execution engine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
operators inside a subgraph are still executed in the execution engine, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
@@ -33,8 +33,12 @@ | |||
namespace mxnet { | |||
namespace engine { | |||
|
|||
#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for debug?
src/engine/naive_engine.cc
Outdated
|
||
namespace mxnet { | ||
namespace engine { | ||
|
||
class NaiveVar final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a document for this class.
@@ -71,8 +81,11 @@ class NaiveEngine final : public Engine { | |||
|
|||
// new variables | |||
VarHandle NewVariable() override { | |||
return NaiveVar::New(); | |||
#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for debug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes.
Great, this will be very helpful for us 👍 |
if (!op_name_set.empty()) { | ||
mxnet::op::SubgraphPropertyPtr property | ||
= std::make_shared<mxnet::op::DefaultSubgraphProperty>(op_name_set); | ||
g.attrs["subgraph_property"] = std::make_shared<nnvm::any>(std::move(property)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for mxnet to enable several acceleration backends at the same time? So then there will be several different subgraph_property
for different backends respectively?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly. Your implementation of SubgraphProperty determines what subgraph op node is used and consequently, what Forward/Backward implementation is adopted. If you want to support multiple backend at the same time, you just need to configure that in the implementation of Forward/Backward. The question is, how do you determine when to use which backend?
|
||
// filter out unqualified pre-selected nodes | ||
std::vector<nnvm::Node*> filtered_nodes = preselected_nodes; | ||
subgraph_selector->Filter(g, &filtered_nodes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have Filter to return a vector of Nodes that are selected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, it's easier for users to implement this function. I will make the change.
} | ||
// find out subgraphs from the filtered nodes | ||
std::vector<std::vector<SimpleNode*>> subgraphs; | ||
PostProcessNodeCandidates(*g, filtered_nodes, simple_nodes, &subgraphs, &subgraph_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we require the nodes returned from Filter belong to a single subgraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this part. If users only care about the network pattern matching, requiring it to return a single subgraph might be too strict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean finding a single subgraph from each node. The algorithm starts to search for subgraphs from every node. It should be sufficient to find at most one subgraph from every node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That seems clearer. I will change the logic of getting multiple subgraphs from the filtered nodes into getting a single subgraph starting from the seed node passed to PreSelectSubgraphNodes
.
Fix
* add functions for cutting edges. * construct subgraphs. * generalize graph partition. * restructure the code. * create SubgraphOpState. * register subgraph property. * rename. * address comments. * update select API. * rename. * set subgraph property. * fix bugs. * fix bugs.
* remove SubgraphOperator. * change the interface of SubgraphSelector.
* update tests. * fix shape/dtype/storage inference. * fix.
* execute subgraph operators correctly. * add HasSubgraph attr. * support subgraph op in imperative. * rewrite running subgraph in main thread. * execute subgraph op in imperative. * remove need_grad.
src/operator/subgraph/common.h
Outdated
exec::DevMaskVector dev_masks(idx_g.num_node_entries(), dev_mask); | ||
|
||
// Put the input and output storages to the storage vector. | ||
nnvm::StorageVector stypes(idx_g.num_node_entries(), exec::kBadStorageID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nnvm::StorageVector is used for plan_memory, not for storage type inference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed.
Graph partitioner and subgraph op
Graph partitioner and subgraph op
Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky
Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky
Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky
Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky
Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky
Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky
* Graph partitioner and subgraph op (#11251) Graph partitioner and subgraph op Fix duplicate entry bugs (#11767) Make subgraph var node name unique (#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky * Clean up * Clean up * Clean up * Change version return type in NDArray * Clean up * Add register or get for subgraph prop registry * Address cr * Remove unnecessary code * Handle var version issue in naive engine * Delete example * Remove registration of resource request for default subgraph op * Add doc string * Improve doc string
* Graph partitioner and subgraph op (apache#11251) Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky * Clean up * Clean up * Clean up * Change version return type in NDArray * Clean up * Add register or get for subgraph prop registry * Address cr * Remove unnecessary code * Handle var version issue in naive engine * Delete example * Remove registration of resource request for default subgraph op * Add doc string * Improve doc string
* Graph partitioner and subgraph op (apache#11251) Graph partitioner and subgraph op Fix duplicate entry bugs (apache#11767) Make subgraph var node name unique (apache#11876) [DO NOT REVIEW] Fix bug of eliminating cycles (apache#11907) * Fix cycle bug * Fix decycle bug * Fix comment [DO NOT REVIEW] Subgraph API (apache#12104) * Initial commit * Add unit tests * Fix lint * Fix lint * Clean up * Add graph partitiong to Bind * Add property name to graph partitioning c api * Fix unit test gpu context * Address cr * Move subgraph to attrs.subgraphs and fix the example * Fix lint * Add var version unit test * Address cr * Enable unit test that was flaky * Clean up * Clean up * Clean up * Change version return type in NDArray * Clean up * Add register or get for subgraph prop registry * Address cr * Remove unnecessary code * Handle var version issue in naive engine * Delete example * Remove registration of resource request for default subgraph op * Add doc string * Improve doc string
Description
(Brief description on what this PR is about)
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
@piiswrong @zheng-da