Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-5501] in_cluster default value in KubernetesPodOperator overwrites configuration #6124

Merged
merged 5 commits into from
Jan 16, 2020

Conversation

qlemaire22
Copy link

@qlemaire22 qlemaire22 commented Sep 16, 2019

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Hi!

The default value of the parameter in_cluster of the kube_client.get_kube_client function is in_cluster=conf.getboolean('kubernetes', 'in_cluster'). Therefore, the expected behavior is that when, in_cluster is not set, it takes the value in the configuration file.

However, the default value of in_cluster in KubernetesPodOperator.py is False and in_cluster is passed as a parameter when calling the kube_client.get_kube_client function. Therefore, it changes the expecting behavior by overwritting the default value. When in_cluster is not set when initializing KubernetesPodOperator, the value of in_cluster in kube_client.get_kube_client is False and not the value which is in the configuration file.

It is quite confusing because it can feel like the value in the configuration file is not working properly.

Therefore, I changed the default value of the in_cluster parameter in KubernetesPodOperator so that it takes the value of the configuration file as a default and expected value instead of False as it is now.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

@qlemaire22
Copy link
Author

Fix committer.

@mik-laj
Copy link
Member

mik-laj commented Sep 21, 2019

Travis is sad. Can you do rebase?

@qlemaire22 qlemaire22 force-pushed the in_cluster_param branch 2 times, most recently from 58b9c63 to 1816f11 Compare September 23, 2019 08:51
The default value of the parameter in_cluster of the
kube_client.get_kube_client function is
in_cluster=conf.getboolean('kubernetes', 'in_cluster'). Therefore, the
expected behavior is that when, in_cluster is not set, it takes the
value in the configuration file.

However, the default value of in_cluster in KubernetesPodOperator.py is
False and in_cluster is passed as a parameter when calling the
kube_client.get_kube_client function. Therefore, it changes the
expecting behavior by overwritting the default value. When in_cluster is
not set when initializing KubernetesPodOperator, the value of in_cluster
in kube_client.get_kube_client is False and not the value which is in
the configuration file.

Therefore, the default value of in_cluster in KubernetesPodOperator has
been changed to None and will not be passed to get_kube_client if it is
not overwritten so that it takes the configuration value as a default
value.
@qlemaire22
Copy link
Author

qlemaire22 commented Sep 23, 2019

I did the rebase, but there are still some errors and Travis is still sad. After some investigation, I realised that my changes were modifying the behavior of some tests.
For example this test in tests/minikube/test_kubernetes_pod_operator.py:

def test_working_pod(self):
        k = KubernetesPodOperator(
            namespace='default',
            image="ubuntu:16.04",
            cmds=["bash", "-cx"],
            arguments=["echo 10"],
            labels={"foo": "bar"},
            name="test",
            task_id="task"
        )
        k.execute(None)
        actual_pod = self.api_client.sanitize_for_serialization(k.pod)
        self.assertEqual(self.expected_pod, actual_pod)

It does not specify the in_cluster argument. Therefore, the old behavior was to use in_cluster=False which was the default value. Now, it is using the value defined in the config which is in_cluster=True (link to the config).

Therefore, the test is now different and it is the case for every failing test.

Shall I modify the failing tests to include the in_cluster=False argument to come back to the old behavior or shall I try to solve the Service host/port is not set error which seems to be caused by some missing environment variables?

@stale
Copy link

stale bot commented Nov 20, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 20, 2019
@stale stale bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 21, 2019
@codecov-io
Copy link

codecov-io commented Nov 21, 2019

Codecov Report

Merging #6124 into master will decrease coverage by 0.32%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6124      +/-   ##
==========================================
- Coverage    83.8%   83.48%   -0.33%     
==========================================
  Files         669      669              
  Lines       37564    37566       +2     
==========================================
- Hits        31480    31361     -119     
- Misses       6084     6205     +121
Impacted Files Coverage Δ
...rflow/contrib/operators/kubernetes_pod_operator.py 75% <0%> (-23.58%) ⬇️
airflow/kubernetes/volume_mount.py 44.44% <0%> (-55.56%) ⬇️
airflow/kubernetes/volume.py 52.94% <0%> (-47.06%) ⬇️
airflow/kubernetes/pod_launcher.py 45.25% <0%> (-46.72%) ⬇️
airflow/kubernetes/refresh_config.py 50.98% <0%> (-23.53%) ⬇️
airflow/configuration.py 89.13% <0%> (-3.63%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fab957e...8043c49. Read the comment docs.

@qlemaire22
Copy link
Author

The tests are successful now.

@stale
Copy link

stale bot commented Jan 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 5, 2020
@stale stale bot closed this Jan 12, 2020
@RosterIn
Copy link
Contributor

@dimberman @qlemaire22 I think this fix is simple and important.
why not to merge this?

@ashb ashb reopened this Jan 14, 2020
@stale stale bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 14, 2020
@ashb
Copy link
Member

ashb commented Jan 14, 2020

@RosterIn because it got forgotten about.

Ive tried resolving the conflicts. Hopefully the test will still pass.

@RosterIn
Copy link
Contributor

seems there is an issue @qlemaire22
error: Incompatible default for argument "in_cluster" (default has type "None", argument has type "bool"

To my understanding None is not compatible with non-optional types by default however it can be overwritten by --no-strict-optional setting

@ashb ashb merged commit e54fba5 into apache:master Jan 16, 2020
kaxil added a commit that referenced this pull request Jan 23, 2020
potiuk pushed a commit that referenced this pull request Jan 26, 2020
kaxil added a commit that referenced this pull request Jan 26, 2020
kaxil added a commit that referenced this pull request Feb 3, 2020
galuszkak pushed a commit to FlyrInc/apache-airflow that referenced this pull request Mar 5, 2020
…or respect config (apache#6124)

The default value of the parameter in_cluster of the
kube_client.get_kube_client function is
in_cluster=conf.getboolean('kubernetes', 'in_cluster'). Therefore, the
expected behavior is that when, in_cluster is not set, it takes the
value in the configuration file.

However, the default value of in_cluster in KubernetesPodOperator.py is
False and in_cluster is passed as a parameter when calling the
kube_client.get_kube_client function. Therefore, it changes the
expecting behavior by overwritting the default value. When in_cluster is
not set when initializing KubernetesPodOperator, the value of in_cluster
in kube_client.get_kube_client is False and not the value which is in
the configuration file.

Therefore, the default value of in_cluster in KubernetesPodOperator has
been changed to None and will not be passed to get_kube_client if it is
not overwritten so that it takes the configuration value as a default
value.

Co-authored-by: Ash Berlin-Taylor <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants