[GOBBLIN-2174] GoT YarnService Integration with DynamicScaling #4077

Open: wants to merge 17 commits into base: master

@Blazer-007 (Contributor) commented Nov 23, 2024

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

☑️ My PR addresses the following Gobblin JIRA issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
- https://issues.apache.org/jira/browse/GOBBLIN-2174

Description

☑️ Here are some details about my PR, including screenshots (if applicable):

Tests

☑️ My PR adds the following unit tests OR does not need testing for this extremely good reason:

  • Tested E2E using the newly created DummyDynamicScalingYarnServiceManager, which uses DummyScalingDirectiveSource to return a fixed set of profiles and their set points
  • YarnServiceTest
    • testBaselineWorkerProfileCreatedWithPassedConfigs and testBuildContainerCommand
  • DynamicScalingYarnServiceManagerTest

Logs from container

2024-11-29 05:22:33 PST [,,] INFO  [DynamicScalingYarnService STARTING] org.apache.gobblin.temporal.yarn.YarnService  - Requesting initial containers using baselineWorkerProfile
2024-11-29 05:22:33 PST [,,] INFO  [DynamicScalingYarnService STARTING] org.apache.gobblin.temporal.yarn.YarnService  - Requesting 2 containers with resource=<memory:8192, vCores:2> and allocation request id = Optional.of(0)
2024-11-29 05:22:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.DynamicScalingYarnService  - Requesting 2 new containers for profile secondProfile having currently 0 containers
2024-11-29 05:22:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.YarnService  - Requesting 2 containers with resource=<memory:2048, vCores:2> and allocation request id = Optional.of(1)
2024-11-29 05:22:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.DynamicScalingYarnService  - Requesting 3 new containers for profile firstProfile having currently 0 containers
2024-11-29 05:22:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.YarnService  - Requesting 3 containers with resource=<memory:2048, vCores:2> and allocation request id = Optional.of(2)
...
...
2024-11-29 05:23:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.DynamicScalingYarnService  - Requesting 1 new containers for profile secondProfile having currently 2 containers
2024-11-29 05:23:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.YarnService  - Requesting 1 containers with resource=<memory:2048, vCores:2> and allocation request id = Optional.of(3)
2024-11-29 05:23:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.DynamicScalingYarnService  - Requesting 2 new containers for profile firstProfile having currently 3 containers
2024-11-29 05:23:59 PST [,,] INFO  [DynamicScalingExecutor] org.apache.gobblin.temporal.yarn.YarnService  - Requesting 2 containers with resource=<memory:2048, vCores:2> and allocation request id = Optional.of(4)
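
The log lines above are consistent with requesting, for each profile, the difference between its set point and its current container count. A minimal sketch of that arithmetic follows. It is not the PR's actual implementation: the class and method names are hypothetical, and the set points used in the usage note (3 and 5) are inferred from the second round of log lines rather than stated anywhere in the PR. Only scale-up is covered, since that is all the logs show.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Gobblin: derives how many new containers
// to request per profile from the desired set points and the current counts.
class ScalingDeltaSketch {

  static Map<String, Integer> containersToRequest(
      Map<String, Integer> setPoints, Map<String, Integer> currentCounts) {
    Map<String, Integer> deltas = new LinkedHashMap<>();
    for (Map.Entry<String, Integer> entry : setPoints.entrySet()) {
      // A profile with no running containers counts as zero.
      int current = currentCounts.getOrDefault(entry.getKey(), 0);
      int delta = entry.getValue() - current;
      // Only scale-up is sketched here; non-positive deltas are skipped.
      if (delta > 0) {
        deltas.put(entry.getKey(), delta);
      }
    }
    return deltas;
  }
}
```

With assumed set points of secondProfile=3 and firstProfile=5 and current counts of 2 and 3, the sketch yields requests for 1 and 2 containers, matching the 05:23:59 log lines.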

Commits

✔️ My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

@Blazer-007 (Contributor, Author) commented

A previous round of review for this PR was done here: phet#1

Also, please ignore the initial commit history; those commits were added while rebasing.

@Blazer-007 changed the title from "[GOBBLIN-] GoT YarnService Integration with DynamicScaling" to "[GOBBLIN-2174] GoT YarnService Integration with DynamicScaling" on Nov 23, 2024

@phet (Contributor) left a comment

looks really good. are there any mocks that would help test YarnService?

Comment on lines 85 to 87
// TODO: remove this line later
// Using for testing purposes only
ScalingDirectiveSource scalingDirectiveSource = new DummyScalingDirectiveSource();

would it be helpful for unit testing if, rather than hard-coding, this class took the ScalingDirectiveSource FQ class name? I see that could be harder based on the ctor params.

As a simpler alternative, make DynamicScalingYarnServiceManager abstract w/ a method

abstract protected ScalingDirectiveSource createScalingDirectiveSource();

and then the concrete FsSourceDynamicScalingYarnServiceManager would hard-code the ScalingDirectiveSource class. you could have a different concrete DSYSM using DummyScalingDirectiveSource. the FQ class name of one of those concrete managers would then be the param.
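
A minimal sketch of the factory-method shape suggested above. All types here are simplified stand-ins: `DirectiveSource`, `AbstractScalingManager`, and `DummyScalingManager` are hypothetical names, and the real Gobblin `ScalingDirectiveSource` interface and manager classes have different signatures. The point is only that the abstract base owns the lifecycle while each concrete subclass hard-codes its own source.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for ScalingDirectiveSource; the real interface differs.
interface DirectiveSource {
  List<String> getScalingDirectives();
}

// Abstract manager: owns the lifecycle, defers the choice of source to subclasses.
abstract class AbstractScalingManager {
  private DirectiveSource source;

  // Each concrete subclass decides which source implementation to create.
  protected abstract DirectiveSource createScalingDirectiveSource();

  void startUp() {
    this.source = createScalingDirectiveSource();
  }

  List<String> poll() {
    return source.getScalingDirectives();
  }
}

// Concrete manager for tests: always returns the same fixed directives.
class DummyScalingManager extends AbstractScalingManager {
  @Override
  protected DirectiveSource createScalingDirectiveSource() {
    return () -> Arrays.asList("firstProfile=3", "secondProfile=2");
  }
}
```

A unit test can then subclass the abstract manager with any source it likes, without touching the production subclass.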

... which reminds me.... how is this DSYSM created and initialized at present?

@Blazer-007 (Contributor, Author) Nov 28, 2024

I am using DummyScalingDirectiveSource so that containers are launched at runtime when I run a job to test the complete E2E flow.

... which reminds me.... how is this DSYSM created and initialized at present?

Here: after starting YarnService (https://github.com/apache/gobblin/blob/master/gobblin-temporal/src/main/java/org/apache/gobblin/temporal/yarn/GobblinTemporalApplicationMaster.java#L102), we initialize the other service classes, whose names are passed through config:

public static final String APP_MASTER_SERVICE_CLASSES = GOBBLIN_YARN_PREFIX + "app.master.serviceClasses";
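
For illustration, a rough sketch of how class names from a config value could be turned into service instances via reflection. This is not Gobblin's actual loading code: `ServiceClassLoaderSketch` is a hypothetical helper, and it assumes a comma-separated value and a no-argument constructor on each class, neither of which is confirmed by this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Gobblin code: instantiates every class named in a
// comma-separated config value, assuming each has a no-argument constructor.
class ServiceClassLoaderSketch {

  static List<Object> instantiateAll(String commaSeparatedClassNames) {
    List<Object> services = new ArrayList<>();
    for (String rawName : commaSeparatedClassNames.split(",")) {
      String className = rawName.trim();
      if (className.isEmpty()) {
        continue; // tolerate stray commas and whitespace
      }
      try {
        services.add(Class.forName(className).getDeclaredConstructor().newInstance());
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Could not create service " + className, e);
      }
    }
    return services;
  }
}
```

Under this scheme, swapping the production manager for a dummy one is a config change only, which is what makes the abstract-class approach testable end to end.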

@Blazer-007 (Contributor, Author) Nov 29, 2024

I agree with the point about creating an abstract class; let me add that in the next commit.
Refactored the code to use AbstractDSYSM and also added unit tests.

@Blazer-007 Blazer-007 marked this pull request as ready for review November 29, 2024 13:56
@Blazer-007 (Contributor, Author) commented Nov 29, 2024

are there any mocks that would help test YarnService?

I have added a unit test for buildContainerCommand.
