[Relay] Remove memory planning from LowerTEPass #8974
Merged
Commits (changes shown from 8 of 18 commits)
f50495b  Clean up LowerTEPass (electriclilies)
6d00a9c  Fix rebase (electriclilies)
3d2ba1b  Remove comment (electriclilies)
ee52039  [TEC] Remove memory plan from LowerTEPass (mikepapadim)
c68f446  Merge branch 'main' of https://github.com/apache/tvm into tec_vm (mikepapadim)
48cb35d  Merge branch 'main' of https://github.com/apache/tvm into tec_vm (mikepapadim)
8a0a3c9  Fix linting errors (mikepapadim)
074ed40  Merge branch 'main' of https://github.com/apache/tvm into tec_vm (mikepapadim)
9733283  Fix PR comments (mikepapadim)
0a1beb6  Remove updated module with function info from LowerTe (mikepapadim)
03a208c  Refactor UpdateMainWorkspaceSize to update func info independently fr… (mikepapadim)
5b3fb11  Merge branch 'main' of https://github.com/apache/tvm into tec_vm (mikepapadim)
b965443  Fix aot failed tests (mikepapadim)
66168f7  Merge branch 'main' of https://github.com/apache/tvm into tec_vm (mikepapadim)
8dfec81  Merge branch 'main' of https://github.com/apache/tvm into tec_vm (mikepapadim)
c4cc22d  Revert whitespaces fixes (mikepapadim)
ed2c1cc  Remove obsolete function hoisting and minor cleanups (mikepapadim)
ce9fbdf  Address PR comments (mikepapadim)
@@ -45,6 +45,13 @@ namespace tvm {
 namespace relay {
 namespace backend {

+struct EnumClassHash {
+  template <typename T>
+  std::size_t operator()(T t) const {
+    return static_cast<std::size_t>(t);
+  }
+};
+
 using IntegerArray = Array<Integer>;
 using StorageMap =
     std::unordered_map<Expr, StorageInfo, runtime::ObjectPtrHash, runtime::ObjectPtrEqual>;
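For context: `EnumClassHash` appears to be the common workaround for standard libraries in which `std::hash` is not specialized for enumeration types (guaranteed only since C++14, and missing in some older toolchains), so an enum such as `DLDeviceType` cannot be used directly as an `std::unordered_map` key. A minimal self-contained sketch of the same idiom; the `Device` enum below is a stand-in, not a TVM type:

#include <cstddef>
#include <cstdio>
#include <unordered_map>

// Hash an enum by casting its underlying value, mirroring the struct in the diff.
struct EnumClassHash {
  template <typename T>
  std::size_t operator()(T t) const {
    return static_cast<std::size_t>(t);
  }
};

// Stand-in for DLDeviceType; any enum works the same way.
enum class Device { kCPU = 1, kGPU = 2 };

int main() {
  // The third template argument supplies the hash that some standard
  // libraries do not provide for enum keys.
  std::unordered_map<Device, int, EnumClassHash> workspace_bytes;
  workspace_bytes[Device::kCPU] = 1024;
  workspace_bytes[Device::kGPU] = 4096;
  std::printf("devices tracked: %zu\n", workspace_bytes.size());
  return 0;
}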
@@ -277,6 +284,132 @@ class AOTExecutorCodegen : public MixedModeVisitor {
     }
   }

+  /*!
+   * \brief Update the "main" control function's metadata.
+   *
+   * \param mod The module
+   * \param targets Map of targets
+   * \param storage_info_map Storage info for each expression in the module
+   * \return Function info for each function in the module
+   */
+  backend::FunctionInfo UpdateMainWorkspaceSize(const IRModule& mod, tec::TargetMap targets,
+                                                Map<Expr, backend::StorageInfo> storage_info_map) {
+    CHECK_EQ(mod->functions.size(), 1)
+        << "There should only be one function in the module passed to UpdateMainWorkspaceSize";
+    Function func = Downcast<Function>(mod->Lookup("main"));
+
+    // This is a Map<device, Map<storage_id, size>>
+    std::unordered_map<DLDeviceType, std::unordered_map<int, int>, EnumClassHash> sid_workspace;
+    // This is a Map<device, size_of_inputs_and_outputs>
+    std::unordered_map<DLDeviceType, int, EnumClassHash> device_io;
+    // This is a Map<device, size_of_constants>
+    std::unordered_map<DLDeviceType, int, EnumClassHash> device_consts;
+
+    // Initialize the mapping from all storage identifiers to workspace sizes,
+    // the amount of device io, and the device constants.
+    for (const auto& kv : storage_info_map) {
+      backend::StorageInfo storage_info = kv.second;
+      std::vector<int64_t> storage_ids = storage_info->storage_ids;
+      std::vector<DLDeviceType> devices = storage_info->device_types;
+
+      CHECK_EQ(storage_ids.size(), devices.size());
+      for (uint32_t i = 0; i < devices.size(); i++) {
+        sid_workspace[devices[i]][storage_ids[i]] = 0;
+        device_io[devices[i]] = 0;
+        device_consts[devices[i]] = 0;
+      }
+    }
+
+    // Iterate the storage map to compute all the tensor sizes in the program.
+    // There are three cases in this code:
+    //
+    // First, we compute the sizes of all inline constants.
+    //
+    // Second, we compute the size of any bound variable, as these are the
+    // input and output sizes of the program.
+    //
+    // Finally, for all other expressions we check which storage identifier
+    // they have been assigned and compute the maximal size of the storage,
+    // as tensors can share storage with other tensors which are the same
+    // size or larger.
+    //
+    // In this final case there is only one allocation for all tensors which
+    // share storage, sized to the largest tensor assigned to it.
+    for (const auto& kv : storage_info_map) {
+      Expr expr = kv.first;
+      int64_t size_bytes = backend::CalculateRelayExprSizeBytes(expr->checked_type());
+      backend::StorageInfo storage_info = kv.second;
+      std::vector<int64_t> storage_ids = storage_info->storage_ids;
+      std::vector<DLDeviceType> devices = storage_info->device_types;
+
+      if (expr->IsInstance<ConstantNode>()) {
+        for (const auto& dev : devices) {
+          device_consts[dev] += size_bytes;
+        }
+        continue;
+      } else if (expr->IsInstance<VarNode>() || expr.same_as(func->body)) {
+        CHECK_GE(devices.size(), 1) << "must be at least one device";
+        for (const auto& dev : devices) {
+          device_io[dev] += size_bytes;
+        }
+        continue;
+      }
+
+      // TODO(@electriclilies): This code is never called, which means
+      // sid_workspace is not updated. Storage info is probably not being
+      // created correctly, or is not equivalent to what was here previously.
+      for (uint32_t i = 0; i < storage_ids.size(); i++) {
+        // Record the largest size among the tensors that share the same
+        // storage id, because a storage_id is shared between multiple
+        // tensors that are not live simultaneously.
+        if (size_bytes > sid_workspace[devices[i]][storage_ids[i]]) {
+          sid_workspace[devices[i]][storage_ids[i]] = size_bytes;
+        }
+      }
+    }
+
+    // This is a Map<device, workspace_size>
+    std::unordered_map<DLDeviceType, int, EnumClassHash> device_workspace;
+    // Once we know the sizes of sids, accumulate per device.
+    for (const auto& dev_sid_size : sid_workspace) {
+      auto dev = dev_sid_size.first;
+      device_workspace[dev] = 0;
+      for (const auto& sid_size : dev_sid_size.second) {
+        device_workspace[dev] += sid_size.second;
+      }
+    }
+
+    Map<Target, Integer> workspace_sizes;
+    Map<Target, Integer> io_sizes;
+    Map<Target, Integer> constant_sizes;
+    Map<Target, tir::PrimFunc> tir_primfuncs;
+    Map<Target, Function> relay_primfuncs;
+
+    // Initialize all target workspaces to zero.
+    for (const auto& kv : targets) {
+      auto tgt = kv.second;
+      workspace_sizes.Set(tgt, 0);
+    }
+
+    for (const auto& dev_and_size : device_workspace) {
+      auto tgt = tec::GetTargetFromInteger(dev_and_size.first, targets);
+      workspace_sizes.Set(tgt, dev_and_size.second);
+      relay_primfuncs.Set(tgt, func);
+    }
+
+    for (const auto& dev_and_size : device_io) {
+      auto tgt = tec::GetTargetFromInteger(dev_and_size.first, targets);
+      io_sizes.Set(tgt, dev_and_size.second);
+    }
+
+    for (const auto& dev_and_size : device_consts) {
+      auto tgt = tec::GetTargetFromInteger(dev_and_size.first, targets);
+      constant_sizes.Set(tgt, dev_and_size.second);
+    }
+
+    return backend::FunctionInfo(workspace_sizes, io_sizes, constant_sizes, tir_primfuncs,
+                                 relay_primfuncs);
+  }
+
   /*!
    * \brief Call a function with a given name
    */
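To make the aggregation in UpdateMainWorkspaceSize above concrete, here is a small self-contained sketch (not TVM code; the device ids, storage ids, and byte sizes are made up) of the two-step workspace computation: first keep the maximum tensor size per storage id, then sum those maxima per device:

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

int main() {
  // device -> (storage_id -> largest tensor size seen), as in sid_workspace.
  std::unordered_map<int, std::unordered_map<int, int64_t>> sid_workspace;

  // Hypothetical assignments: on device 1, tensors of 64 and 256 bytes share
  // storage id 0 (they are not live simultaneously), and a 128-byte tensor
  // uses storage id 1.
  struct Entry { int device, sid; int64_t size_bytes; };
  const Entry entries[] = {{1, 0, 64}, {1, 0, 256}, {1, 1, 128}};

  // Step 1: keep the maximum size per (device, storage id).
  for (const Entry& e : entries) {
    int64_t& slot = sid_workspace[e.device][e.sid];
    slot = std::max(slot, e.size_bytes);
  }

  // Step 2: a device's workspace is the sum of its per-sid maxima.
  for (const auto& dev_sids : sid_workspace) {
    int64_t total = 0;
    for (const auto& sid_size : dev_sids.second) total += sid_size.second;
    // Prints: device 1 workspace = 384 bytes (256 for sid 0 + 128 for sid 1).
    std::printf("device %d workspace = %lld bytes\n", dev_sids.first,
                static_cast<long long>(total));
  }
  return 0;
}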
@@ -583,8 +716,15 @@ class AOTExecutorCodegen : public MixedModeVisitor {
     // performing the preexisting AOT executor code generation phase.
     IRModule mod = IRModule::FromExpr(func);

+    backend::FunctionInfo func_info;
+
+    if (memory_plan.defined()) {
+      // TODO(@electriclilies, @jroesch): remove UpdateMainWorkspaceSize
+      func_info = UpdateMainWorkspaceSize(mod, targets_, memory_plan->expr_to_storage_info);
+    }
+

Review comment: Can you just put the

     IRModule lowered_mod =
-        LowerTEPass(targets_, device_context_map, memory_plan, mod_name, [this](Function func) {
+        LowerTEPass(targets_, device_context_map, mod_name, [this](Function func) {
           // We need to maintain the constant map for external
           // functions so we pass this processing function which
           // allows us to process each function as we lower it.
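The lambda passed to LowerTEPass here follows a simple "process function" hook pattern: the lowering pass calls back into the codegen for each function, so per-function state (here, the constant map for external functions) can be maintained by the caller. A generic sketch of that pattern, with all names hypothetical rather than TVM APIs:

#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real IR types.
struct Function { std::string name; };
using ProcessFn = std::function<void(const Function&)>;

// A lowering driver that invokes the caller's hook on every function,
// analogous to LowerTEPass receiving a per-function callback.
void LowerAll(const std::vector<Function>& funcs, ProcessFn process_fn) {
  for (const Function& f : funcs) {
    // ... lowering work would happen here ...
    process_fn(f);  // let the caller update its own bookkeeping
  }
}

int main() {
  std::vector<std::string> seen;  // caller-side state, like a constant map
  LowerAll({{"main"}, {"conv2d"}},
           [&seen](const Function& f) { seen.push_back(f.name); });
  std::printf("processed %zu functions\n", seen.size());
  return 0;
}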
@@ -676,7 +816,7 @@ class AOTExecutorCodegen : public MixedModeVisitor {

     Optional<Array<tvm::runtime::Module>> external_modules =
         lowered_mod->GetAttr<Array<tvm::runtime::Module>>("external_mods");
-    ICHECK(external_modules) << "Attribute \"external_modules\" should be set at this point.";
+    ICHECK(external_modules) << "Attribute \"external_mods\" should be set at this point.";

     // This is the point where we separate the functions in the module by target
     ret.lowered_funcs = tec::GetPerTargetModules(lowered_mod);
Review comment: Could you move EnumClassHash into "utils.h" and remove this definition of it and the definition in the te compiler?

Reply: done