-
Notifications
You must be signed in to change notification settings - Fork 594
Add APIs to control packing/repacking algorithm and change instance resources #2059
base: master
Are you sure you want to change the base?
Add APIs to control packing/repacking algorithm and change instance resources #2059
Conversation
These packing.algorithm configs are set in yaml as well. What happens if config in yaml conflicts with setting in topology writter's code? |
This argument sounds strange to me. If this needs to be true all the time, how shall we change packing algorithm? The only way is to pass |
@huijunw these are not system or component level configs as defined on that page. They're not set in |
* In bytes. | ||
*/ | ||
public static final String TOPOLOGY_INSTANCE_RAM_REQUESTED = | ||
"heron.resources.instance.ram"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please append with .bytes
to make units explicit. same for disk settings.
conf.put(Config.TOPOLOGY_INSTANCE_CPU_REQUESTED, Float.toString(ncpus)); | ||
} | ||
|
||
public static void setInstanceDiskRequested(Map<String, Object> conf, ByteAmount nbytes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/nbytes/byteAmount/g here and below.
conf.put(Config.TOPOLOGY_REPACKING_ALGORITHM, type.toString()); | ||
} | ||
|
||
public static void setInstanceCpuRequested(Map<String, Object> conf, float ncpus) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ncpus/cpus/g here and below
@@ -452,6 +487,28 @@ public static void setTopologyExactlyOnceEnabled(Map<String, Object> conf, boole | |||
conf.put(Config.TOPOLOGY_EXACTLYONCE_ENABLED, String.valueOf(exactOnce)); | |||
} | |||
|
|||
public static void setTopologyPackingAlgorithm(Map<String, Object> conf, | |||
PackingAlgorithmType type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The packing type is just a Class. It needs to be extensible so users can implement their own so an enum won't work.
public static void setTopologyPackingClass(Map<String, Object> conf,
Class<? extends IPacking> packingClass);
public static void setTopologyRepackingClass(Map<String, Object> conf,
Class<? extends IRepacking> repackingClass);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was worried if doing this needs to pull in lots of dependencies from schedulers. But I can try though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would cause a problem, since IPacking and IRepacking are in spi, which depends on api, but not the other way around. We could considering moving these interfaces into api. @kramasamy do you have an opinion on this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I was not clear in my message above: yes, I was worried about circular dependencies as well when reading BUILD files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding packaging of IPacking
and IRepacking
, is the distribution of interfaces between api
and spi
packages needed? Given the current structure, moving these interfaces may create confusion. These interfaces are closely linked to other components like IScheduler
.
We could avoid future movements of interfaces used by users by keeping all the interfaces implemented by custom code in api package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent (which makes sense to me) is that api
is where topology-level APIs are that topology developers might implement or specify (i.e., IBolt, ISpout, etc). The spi
package is where system-level APIs live that would be implemented by administrating teams for a new scheduler or state manager for example.
I think the packing algorithms could be implemented or at least specified by topology authors at the topology level. Hence it makes sense for those interfaces to move to api
, but not others perse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @billonahill I agree with the interpretation of the api
and spi
packages.
I think packing configuration is rarely provided by topology developer. It is rather commonly specified by the administrator or the member deploying the topology via yaml
config files or command line arguments. To me this fits the definition of system-level API. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the packing impls should be defaulted at the system level, but I certainly see the need to be able to override them at the topology level. We've recommended to some topology owners to specify a different packing impl to change the way resources are allocated for their specific job, which seems reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One major feedback from my side.
Heron API should be self contained. It should not be dependent on any other module. I already see dependency on heron.common. This must be removed
@srkukarni that dep already exists, so removing it should not be in scope for this issue, but this patch should not make it worse. Opened PR #2061 to remove dep on |
@billonahill This pr makes it worse by adding more dep from common. |
@srkukarni which I'm saying it shouldn't do. :) That probably wasn't clear in my wording above... |
/** | ||
* Packing algorithm used to calculate packing plan | ||
*/ | ||
public static final String TOPOLOGY_PACKING_ALGORITHM = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really doubt if we need the packing algorithm to be configurable by users. By the original design, it's a system wide property which should only be set by the heron administrator.
Also, how are the ram
, cpu
and disk
resource configs used? It seems these config doesn't tell the difference for different components. It applies all components' instances for a topology. Correct me if my understanding is wrong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see having multiple packing implementation supported, where power users could specify a non-default based on their needs. Or even write one of their own.
It seems that we are not going to reach an agreement here soon. I have asked @ttim to add |
In Twitter, customers wants the ability to change packing algorithm, and tune instance ram usage without passing
--config-property
to Heron client. This PR adds API to change them.Yes, the
PackingAlgorithmType.java
andRepackingAlgorithmType.java
are pretty ugly. Please free feel to advise better solutions.