Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FLAGS_allow_cinn_ops & FLAGS_deny_cinn_ops for controlling op types used in training with CINN. #36842

Merged
merged 18 commits into from
Nov 3, 2021

Conversation

wzzju
Copy link
Contributor

@wzzju wzzju commented Oct 28, 2021

PR types

Others

PR changes

Others

Describe

  • 增加FLAGS_allow_cinn_opsFLAGS_deny_cinn_ops两个flag,用于控制Paddle训练中使用CINN算子代替原生算子的种类。
  • 使用RWLock,完善CinnCompiler类在多线程运行环境下的正确性(主要关于对cache_的操作)。
  • CinnCompiler类增加VizGraphReadableKey方法,用于打印子图的dot信息以及获取具备可读性的CompilationKey。
  • test_parallel_executor_run_cinn.py单测中增加对以下两种情况中add+relu模型loss值的比较:
    • 使用内嵌CINN编译功能的Paddle训练add+relu模型
    • 使用原生Paddle训练add+relu模型

@paddle-bot-old
Copy link

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@wzzju wzzju changed the title Compare add+relu loss values (CINN). Compare add+relu loss values (CINN Test). Oct 28, 2021
cinn_losses = train(self.tmpdir, "paddle")
set_cinn_flag(False)
pd_losses = train(self.tmpdir, "cinn")
np.allclose(cinn_losses, pd_losses)
Copy link
Contributor

@thisjiang thisjiang Oct 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
np.allclose(cinn_losses, pd_losses)
self.assertTrue(np.allclose(cinn_losses, pd_losses, atol=1e-5))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@wzzju wzzju changed the title Compare add+relu loss values (CINN Test). Add FLAGS_allow_cinn_ops & FLAGS_deny_cinn_ops for controlling op types used in training with CINN. Nov 2, 2021
if (!graphs_.count(graph_key)) {
graphs_[graph_key] = std::move(graph);
} else {
LOG(WARNING)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

会出现子图已经被注册的情况吗

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这种情况不确定,如果只是运用build_cinn_pass一次,应该不会出现,这里已改为使用PADDLE_ENFORCE

@@ -59,40 +61,60 @@ std::string CinnCompiler::AddGraph(std::unique_ptr<Graph> graph) {
ProgramDesc program;
GraphToProgram(*graph, &program);
program.Proto()->SerializeToString(&graph_key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

序列化的字符串直接作为key比较冗余,查找效率低、还占空间。是否以其hash值作为key,CinnCompiler额外提供接口可以由key获取其子图的序列化字符串?

Copy link
Contributor Author

@wzzju wzzju Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::unordered_map<std::string, std::unique_ptr<ir::Graph>> graphs_;

std::string作为std::unordered_map的key,查找时就是使用string的hash code查找的,你说不存储这个字符串,是想通过hash code回转得到愿字符串吗?这个恐怕是做不到的face_69@2x

Copy link
Contributor

@CtfGo CtfGo Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不是,查找时是string->hash吧,我意思是key存hash code,CinnCompiler提供接口由key获取graph->debug string替代ReadableProtoStr,不过这不重要。

Xreki
Xreki previously approved these changes Nov 3, 2021
Copy link
Contributor

@Xreki Xreki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the use of const_cast and op benchmark ci

auto* cinn_compiler = CinnCompiler::GetInstance();
const auto& compiling_graph = cinn_compiler->FindGraph(compilation_key);
// viz_graph("compiling_graph.dot", const_cast<Graph*>(&compiling_graph));
viz_graph("compiling_graph.dot", const_cast<Graph*>(&compiling_graph));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个测试生成的文件是不是应该注释掉?或者在函数最后删掉?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

方便后续调试,先不删除,ci上build目录生成的文件会被自动清理的。

}
// if the op type is registered in CINN and deny_ops is not empty, return
// true only when it is not in deny_ops
auto deny_ops = StringSplit(FLAGS_deny_cinn_ops, kDelim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

node->Name()) != nullptr;
// if the op type is registered in CINN and allow_ops is not empty, return
// true only when it is in allow_ops
auto allow_ops = StringSplit(FLAGS_allow_cinn_ops, kDelim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个可以放在外面吧?就不用每次都计算了

Suggested change
auto allow_ops = StringSplit(FLAGS_allow_cinn_ops, kDelim);
auto allow_ops = StringSplit(FLAGS_allow_cinn_ops, kDelim);
auto teller = [&allow_ops](const Node* node) { ... };

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

lanxianghit
lanxianghit previously approved these changes Nov 3, 2021
thisjiang
thisjiang previously approved these changes Nov 3, 2021
Copy link
Contributor

@thisjiang thisjiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

CtfGo
CtfGo previously approved these changes Nov 3, 2021
@wzzju wzzju dismissed stale reviews from CtfGo, thisjiang, and lanxianghit via 0f86f5f November 3, 2021 03:31
Xreki
Xreki previously approved these changes Nov 3, 2021
Copy link
Contributor

@thisjiang thisjiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wzzju wzzju merged commit 2479664 into PaddlePaddle:develop Nov 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants