This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Add APIs to control packing/repacking algorithm and change instance resources #2059

Open
wants to merge 2 commits into
base: master

objmagic
Contributor

At Twitter, customers want the ability to change the packing algorithm and tune instance RAM usage without passing --config-property to the Heron client. This PR adds APIs to change them.

Yes, PackingAlgorithmType.java and RepackingAlgorithmType.java are pretty ugly. Please feel free to suggest better solutions.
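
For readers following along, here is a minimal sketch of what topology-level setters of this kind might look like. It is not the code from this PR: the class name and the packing key are assumptions made for illustration, and only the RAM key is taken from the diff under review. Values are stored as strings, mirroring the `Float.toString(...)` and `type.toString()` calls visible in the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the topology Config helpers discussed in this PR.
final class TopologyResourceConfigSketch {
  // Key as it appears in the diff under review.
  static final String TOPOLOGY_INSTANCE_RAM_REQUESTED = "heron.resources.instance.ram";
  // Assumed key name, for illustration only.
  static final String TOPOLOGY_PACKING_ALGORITHM = "example.topology.packing.algorithm";

  private TopologyResourceConfigSketch() { }

  // Records the requested RAM per instance, in bytes, as a string value.
  static void setInstanceRamRequested(Map<String, Object> conf, long ramBytes) {
    if (ramBytes <= 0) {
      throw new IllegalArgumentException("ramBytes must be positive: " + ramBytes);
    }
    conf.put(TOPOLOGY_INSTANCE_RAM_REQUESTED, Long.toString(ramBytes));
  }

  // Records the packing algorithm by name.
  static void setPackingAlgorithm(Map<String, Object> conf, String algorithmName) {
    conf.put(TOPOLOGY_PACKING_ALGORITHM, algorithmName);
  }

  public static void main(String[] args) {
    Map<String, Object> conf = new HashMap<>();
    setInstanceRamRequested(conf, 1024L * 1024L * 1024L); // 1 GiB
    setPackingAlgorithm(conf, "SomePackingAlgorithm");     // placeholder name
    System.out.println(conf);
  }
}
```

The point of the sketch is only that the topology author sets these values in code, in the same map that carries the rest of the topology config, rather than on the command line.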

@objmagic objmagic requested a review from billonahill July 13, 2017 20:50
@huijunwu
Member

huijunwu commented Jul 13, 2017

These packing.algorithm configs are set in YAML as well. What happens if the config in YAML conflicts with a setting in the topology writer's code?
The documentation at https://twitter.github.io/heron/docs/operators/configuration/config-intro/ states: 'Neither system- nor component-level configurations can be overridden by topology developers.' Does this PR violate that?

@huijunw huijunw requested review from srkukarni and maosongfu July 13, 2017 21:17
@objmagic
Contributor Author

This argument sounds strange to me. If this must hold all the time, how are we supposed to change the packing algorithm? Is passing --config-property the only way?

@billonahill
Contributor

@huijunw these are not system or component level configs as defined on that page. They're not set in heron_internal.yaml. These should be defaulted but overridable by the topology.
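
To make "defaulted but overridable by the topology" concrete, here is a small sketch of one possible precedence rule. It is an assumption for illustration, not a description of how Heron actually merges its configs; the class name, key and values are all placeholders.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "system default, topology override" precedence.
final class ConfigPrecedenceSketch {
  private ConfigPrecedenceSketch() { }

  // Starts from the YAML defaults, then lets topology-level settings win on conflict.
  static Map<String, Object> resolve(Map<String, Object> yamlDefaults,
                                     Map<String, Object> topologyOverrides) {
    Map<String, Object> effective = new HashMap<>(yamlDefaults);
    effective.putAll(topologyOverrides);
    return effective;
  }

  public static void main(String[] args) {
    Map<String, Object> yaml = new HashMap<>();
    yaml.put("example.packing.algorithm", "DefaultPacking"); // placeholder key and value

    Map<String, Object> topology = new HashMap<>();
    topology.put("example.packing.algorithm", "CustomPacking");

    // The topology-level value replaces the YAML default in the result.
    System.out.println(resolve(yaml, topology));
  }
}
```

Under this rule, a key that the topology does not set keeps its YAML value, so administrators still control the defaults.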

* In bytes.
*/
public static final String TOPOLOGY_INSTANCE_RAM_REQUESTED =
"heron.resources.instance.ram";
Contributor

Please append .bytes to the key to make the units explicit. Same for the disk settings.

conf.put(Config.TOPOLOGY_INSTANCE_CPU_REQUESTED, Float.toString(ncpus));
}

public static void setInstanceDiskRequested(Map<String, Object> conf, ByteAmount nbytes) {
Contributor

s/nbytes/byteAmount/g here and below.

conf.put(Config.TOPOLOGY_REPACKING_ALGORITHM, type.toString());
}

public static void setInstanceCpuRequested(Map<String, Object> conf, float ncpus) {
Contributor

s/ncpus/cpus/g here and below

@@ -452,6 +487,28 @@ public static void setTopologyExactlyOnceEnabled(Map<String, Object> conf, boole
conf.put(Config.TOPOLOGY_EXACTLYONCE_ENABLED, String.valueOf(exactOnce));
}

public static void setTopologyPackingAlgorithm(Map<String, Object> conf,
PackingAlgorithmType type) {
Contributor

The packing type is just a Class. It needs to be extensible so that users can implement their own, so an enum won't work. Something like:

public static void setTopologyPackingClass(Map<String, Object> conf,
    Class<? extends IPacking> packingClass);
public static void setTopologyRepackingClass(Map<String, Object> conf,
    Class<? extends IRepacking> repackingClass);
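
A rough sketch of how a class-based setter like this could work end to end: the setter stores the class name, and the consumer loads it reflectively. Everything here is hypothetical. The `IPacking` interface below is a one-method stand-in declared locally so the example compiles on its own (it is not the real interface), and the key name, `RoundRobinSketch` and `loadPacking` are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for the real packing interface; the method is hypothetical.
interface IPacking {
  String name();
}

// A user-supplied implementation, which an enum-based API could not express.
final class RoundRobinSketch implements IPacking {
  @Override
  public String name() {
    return "round-robin";
  }
}

final class PackingClassConfigSketch {
  // Assumed key name, for illustration only.
  static final String TOPOLOGY_PACKING_CLASS = "example.topology.packing.class";

  private PackingClassConfigSketch() { }

  // Stores the fully qualified class name so the config stays a plain string map.
  static void setTopologyPackingClass(Map<String, Object> conf,
                                      Class<? extends IPacking> packingClass) {
    conf.put(TOPOLOGY_PACKING_CLASS, packingClass.getName());
  }

  // Loads and instantiates the configured class via its no-argument constructor.
  static IPacking loadPacking(Map<String, Object> conf) {
    String className = (String) conf.get(TOPOLOGY_PACKING_CLASS);
    if (className == null) {
      throw new IllegalStateException("no packing class configured");
    }
    try {
      Class<? extends IPacking> cls = Class.forName(className).asSubclass(IPacking.class);
      return cls.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("cannot instantiate " + className, e);
    }
  }

  public static void main(String[] args) {
    Map<String, Object> conf = new HashMap<>();
    setTopologyPackingClass(conf, RoundRobinSketch.class);
    System.out.println(loadPacking(conf).name());
  }
}
```

Note that the setter's signature needs the interface on the compile classpath of whichever module hosts the setter, even though only the class name is stored.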

Contributor Author

@objmagic objmagic Jul 13, 2017

I was worried that doing this would pull in lots of dependencies from the schedulers, but I can try.

Contributor

This would cause a problem, since IPacking and IRepacking are in spi, which depends on api, but not the other way around. We could consider moving these interfaces into api. @kramasamy do you have an opinion on this one?

Contributor Author

You are right. I was not clear in my message above: yes, I was worried about circular dependencies as well when reading BUILD files.

Contributor

Regarding the packaging of IPacking and IRepacking: is the split of interfaces between the api and spi packages needed? Given the current structure, moving these interfaces may create confusion, since they are closely linked to other components like IScheduler.

Could we avoid moving user-facing interfaces in the future by keeping all the interfaces that custom code implements in the api package?

Contributor

I think the intent (which makes sense to me) is that api is where the topology-level APIs live that topology developers might implement or specify (e.g., IBolt, ISpout, etc.). The spi package is where the system-level APIs live that would be implemented by the teams administering Heron, for a new scheduler or state manager for example.

I think the packing algorithms could be implemented, or at least specified, by topology authors at the topology level. Hence it makes sense for those interfaces to move to api, but not the others per se.

Contributor

Thanks @billonahill I agree with the interpretation of the api and spi packages.

I think packing configuration is rarely provided by the topology developer. More commonly it is specified by the administrator, or by whoever deploys the topology, via YAML config files or command-line arguments. To me this fits the definition of a system-level API. WDYT?

Contributor

I think the packing impls should be defaulted at the system level, but I certainly see the need to be able to override them at the topology level. We've recommended to some topology owners to specify a different packing impl to change the way resources are allocated for their specific job, which seems reasonable to me.

Contributor

@srkukarni srkukarni left a comment

One major piece of feedback from my side:
The Heron API should be self-contained. It should not depend on any other module. I already see a dependency on heron.common. This must be removed.

@billonahill
Contributor

billonahill commented Jul 13, 2017

@srkukarni that dep already exists, so removing it should not be in scope for this issue, but this patch should not make it worse. Opened PR #2061 to remove dep on TypeUtils and #2062 to move ByteAmount into API.

@srkukarni
Contributor

@billonahill This PR makes it worse by adding more deps on common.

@billonahill
Contributor

@srkukarni which I'm saying it shouldn't do. :) That probably wasn't clear in my wording above...

@objmagic
Contributor Author

depends on #2061 #2066

/**
* Packing algorithm used to calculate packing plan
*/
public static final String TOPOLOGY_PACKING_ALGORITHM =
Member

I really doubt that we need the packing algorithm to be configurable by users. In the original design, it is a system-wide property that should only be set by the Heron administrator.

Also, how are the RAM, CPU and disk resource configs used? It seems these configs don't distinguish between components; they apply to the instances of all components in a topology. Correct me if my understanding is wrong here.
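
To illustrate the distinction being asked about, here is a hypothetical sketch that contrasts a single topology-wide RAM value with per-component values. Treating the topology-wide value as a fallback is purely an assumption for the example; nothing in this PR states that the settings are combined this way, and all names are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup contrasting topology-wide and per-component RAM settings.
final class InstanceRamLookupSketch {
  private InstanceRamLookupSketch() { }

  // Returns the component's own RAM setting if present, else the topology-wide value.
  static long ramBytesFor(String component,
                          Map<String, Long> perComponentRamBytes,
                          long topologyWideRamBytes) {
    return perComponentRamBytes.getOrDefault(component, topologyWideRamBytes);
  }

  public static void main(String[] args) {
    Map<String, Long> perComponent = new HashMap<>();
    perComponent.put("heavy-bolt", 2L * 1024 * 1024 * 1024); // placeholder component name

    long topologyWide = 512L * 1024 * 1024;
    System.out.println(ramBytesFor("heavy-bolt", perComponent, topologyWide));
    System.out.println(ramBytesFor("light-spout", perComponent, topologyWide));
  }
}
```

With an empty per-component map, every instance receives the same value, which is the behaviour the comment above describes.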

Contributor

I could see having multiple packing implementations supported, where power users could specify a non-default one based on their needs, or even write one of their own.

@objmagic
Copy link
Contributor Author

It seems that we are not going to reach an agreement here soon. I have asked @ttim to add --config-property support in tsar config.

6 participants