
PermissionDenied : Unable to use docker-compose due to UID conflicts #17320

Closed
ImadYIdrissi opened this issue Jul 29, 2021 · 12 comments
Labels: invalid, kind:bug

@ImadYIdrissi

Apache Airflow version: 2.1.0

Environment:

  • Cloud provider or hardware configuration: GCP Compute Engine - e2-standard-4 (4 vCPUs, 16 GB memory)
  • OS (e.g. from /etc/os-release): Ubuntu 18.04
  • Kernel (e.g. uname -a): Linux lamachine-preprod 5.4.0-1049-gcp #53~18.04.1-Ubuntu SMP Thu Jul 15 11:32:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

What happened:
When trying to run $ sudo docker-compose run airflow-init bash, I get the following output:

Creating be-api_airflow-init_run ... done
....................
ERROR! Maximum number of retries (20) reached.

Last check result:
$ airflow db check
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__init__.py", line 34, in <module>
    from airflow import settings
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/settings.py", line 35, in <module>
    from airflow.configuration import AIRFLOW_HOME, WEBSERVER_CONFIG, conf  # NOQA F401
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/configuration.py", line 1115, in <module>
    conf = initialize_config()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/configuration.py", line 836, in initialize_config
    with open(AIRFLOW_CONFIG, 'w') as file:
PermissionError: [Errno 13] Permission denied: '/home/airflow/airflow.cfg'
ERROR: 1

What you expected to happen:

I expected the container to initialize correctly, with the proper file permissions for the UID specified in the docker-compose.yml file, and output resembling this:

Creating be-api_airflow-init_run ... done
BACKEND=postgresql+psycopg2
DB_HOST=postgres
DB_PORT=5432

DB: postgresql+psycopg2://airflow:***@postgres/airflow
[2021-07-29 16:25:03,687] {db.py:695} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
Upgrades done
airflow already exist in the db
airflow@7a15c956e187:/opt/airflow$ cd /home/airflow/
airflow@7a15c956e187:~$

P.S.: This output is achieved by using UID=50000 in the .env file that accompanies the docker-compose.yml file.

When using a different UID (i.e. 1001 in my case) in order to match the file permissions of ./dags, ./logs and ./plugins, the error occurs. I think UID=50000 was enforced at some point in the Dockerfile of the Airflow image and is not correctly substituted when docker-compose.yml tries to change the value, so the files under /home/airflow are still created with UID 50000 as their owner, while the sub-directories ./dags, ./logs and ./plugins have the UID/GID of the host system.
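
For reference, a diagnostic sketch (based on the docker-compose.yml further down) that bypasses the entrypoint and compares the container's effective UID with the ownership of AIRFLOW_HOME and the mounted dags folder:

docker-compose run --rm --entrypoint bash airflow-init -c 'id; ls -lnd "$AIRFLOW_HOME" "$AIRFLOW_HOME/dags"'

If the description above is right, this should show /home/airflow still owned by UID 50000 while ./dags carries the host UID.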

There are 2 major issues with the approach of using a fixed UID:

  1. If I have to create and use a single UID=50000 that will handle all airflow operations, then my airflow file system within the host cannot be operated properly with different users, e.g. devs when pulling new changes from git...
  2. Even if this works properly and we can use a UID other than 50000, it still restricts actions to a single user, bound to GID=0 (this is a requirement from Airflow). The result is the same limitation as mentioned earlier, i.e. only one UID will be able to change the host file system. (Maybe I need to create a separate issue for this.)

How to reproduce it:
Create a project with the following structure

custom-project
 ┣ src
 ┃ ┣ dags
 ┃ ┃ ┗ hello_geeks.py
 ┃ ┣ logs
 ┃ ┗ plugins
 ┣ .env
 ┣ README.md
 ┗ docker-compose.yml

Use the following files and run the command sudo docker-compose run airflow-init bash

docker-compose.yml file:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL.
#
# WARNING: This configuration is for local development. Do not use it in a production deployment.
#
# This configuration supports basic configuration using environment variables or an .env file
# The following variables are supported:
#
# AIRFLOW_IMAGE_NAME           - Docker image name used to run Airflow.
#                                Default: apache/airflow:|version|
# AIRFLOW_UID                  - User ID in Airflow containers
#                                Default: 50000
# AIRFLOW_GID                  - Group ID in Airflow containers
#                                Default: 50000
#
# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
#
# _AIRFLOW_WWW_USER_USERNAME   - Username for the administrator account (if requested).
#                                Default: airflow
# _AIRFLOW_WWW_USER_PASSWORD   - Password for the administrator account (if requested).
#                                Default: airflow
# _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers.
#                                Default: ''
#
# Feel free to modify this file to suit your needs.
---
    version: '3'
    x-airflow-common:
      &airflow-common
      image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.0}
      environment:
        &airflow-common-env
        AIRFLOW__CORE__EXECUTOR: CeleryExecutor
        AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
        AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
        AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
        AIRFLOW__CORE__FERNET_KEY: ''
        AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
        AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
        AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
        AIRFLOW_HOME: '${AIRFLOW_HOME:-/opt/airflow}'
        _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
      volumes:
        - ./src/dags:${AIRFLOW_HOME:-/opt/airflow}/dags
        - ./src/logs:${AIRFLOW_HOME:-/opt/airflow}/logs
        - ./src/plugins:${AIRFLOW_HOME:-/opt/airflow}/plugins
      user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
      depends_on:
        &airflow-common-depends-on
        redis:
          condition: service_healthy
        postgres:
          condition: service_healthy
    
    services:
      postgres:
        image: postgres:13
        environment:
          POSTGRES_USER: airflow
          POSTGRES_PASSWORD: airflow
          POSTGRES_DB: airflow
        volumes:
          - postgres-db-volume:/var/lib/postgresql/data
        healthcheck:
          test: ["CMD", "pg_isready", "-U", "airflow"]
          interval: 5s
          retries: 5
        restart: always
    
      redis:
        image: redis:latest
        expose:
          - 6379
        healthcheck:
          test: ["CMD", "redis-cli", "ping"]
          interval: 5s
          timeout: 30s
          retries: 50
        restart: always
    
      airflow-webserver:
        <<: *airflow-common
        command: webserver
        ports:
          - 9999:8080
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully
    
      airflow-scheduler:
        <<: *airflow-common
        command: scheduler
        healthcheck:
          test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully
    
      airflow-worker:
        <<: *airflow-common
        command: celery worker
        healthcheck:
          test:
            - "CMD-SHELL"
            - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully
    
      airflow-init:
        <<: *airflow-common
        command: version
        environment:
          <<: *airflow-common-env
          _AIRFLOW_DB_UPGRADE: 'true'
          _AIRFLOW_WWW_USER_CREATE: 'true'
          _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
          _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
    
      airflow-cli:
        <<: *airflow-common
        profiles:
          - debug
        environment:
          <<: *airflow-common-env
          CONNECTION_CHECK_MAX_COUNT: "0"
        # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
        command:
          - bash
          - -c
          - airflow
    
      flower:
        <<: *airflow-common
        command: celery flower
        ports:
          - 5555:5555
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully
    
    volumes:
      postgres-db-volume:

.env file:

AIRFLOW_UID=1001
AIRFLOW_GID=0
AIRFLOW_HOME=/home/airflow
ImadYIdrissi added the kind:bug label on Jul 29, 2021
@ImadYIdrissi (Author)

I did find this thread that deals with this issue, but I believe the fix should be incorporated into the public community image directly instead of forking with a custom image. For instance, this custom image by puckel, which solves the issue, is for Airflow 1.10.9.

@potiuk (Member) commented Jul 29, 2021

  1. The Docker Compose setup of Airflow is not production ready. You should use it only for development and testing; if you want a more production-grade setup I recommend using the community's official Helm Chart (https://airflow.apache.org/docs/helm-chart/stable/index.html) and K8S.

  2. If you mount your local folders as volumes into Airflow, you should make sure that you use your host UID as the user id and GID=0 as the group. This is spelled out in the "Initializing Environment" section of the Docker Compose quick-start documentation: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#initializing-environment. Apparently you missed that step, so let me copy it here (you need to run it once on the host, in the directory where you have the docker-compose file):

echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env

How it works: it changes the user that Airflow runs as to be the same as your host user, and sets the group ID to 0 (a best practice taken from OpenShift that makes it possible to run the container image as an arbitrary user).
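
For a host user with UID 1001 (the value used in this report), that one-liner simply produces a .env like:

AIRFLOW_UID=1001
AIRFLOW_GID=0

As described in the entrypoint docs linked below, the image then creates a passwd entry for that UID inside the container (user "default", home /home/airflow, group 0), so anything Airflow writes to the mounted folders ends up owned by your host user.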

  3. No other configuration is supported for Docker Compose when you mount your local folder on Linux. This is a Docker limitation, not an Airflow or image limitation. The Airflow image is defined according to OpenShift best practices and allows running as an arbitrary user id, but when you mount a local volume from the host, the user from your host stays the owner. There are various ways YOU can deal with the problem when you run the container; one of them is the very solution Airflow proposes, where you define and use the host UID and GID=0 to run the image. You can read more here: https://airflow.apache.org/docs/docker-stack/entrypoint.html#allowing-arbitrary-user-to-run-the-container

  4. The solution you copied is to manually build the image and hard-code your UID as the user there. That (obviously) cannot be done in the public image, because we do not know your user id when we prepare the image and each user has a different UID. It's just impossible.

potiuk closed this as completed on Jul 29, 2021
potiuk added the invalid label on Jul 29, 2021
@potiuk (Member) commented Jul 29, 2021

BTW, I think you are not aware that your user changes when you run sudo. DO NOT use sudo when you run docker compose or docker commands, because then they will run as the "root" user (UID=0), which is likely the root cause of your problem. Most likely your logs/, dags/, etc. files have been created as owned by that root user and this is causing all your permission problems. Make sure you do what the Docker installation guide suggests (https://docs.docker.com/engine/install/linux-postinstall/): add your user to the docker group so you do not have to use sudo to run docker commands. Then check what permissions/ownership you have for the dags/, logs/, etc. folders (and the files inside) and change them to be owned by your user rather than root.

And then follow the steps from the quick start.
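
A sketch of those steps, assuming the ./src layout from this issue (adjust the paths to your setup):

# let your user run docker without sudo (log out and back in afterwards)
sudo usermod -aG docker "$USER"

# check who owns the mounted folders, numerically
ls -ln src/dags src/logs src/plugins

# if anything is owned by root, hand it back to your user (keeping group 0, as Airflow expects)
sudo chown -R "$(id -u):0" src/dags src/logs src/plugins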

@ImadYIdrissi (Author) commented Jul 29, 2021

I don't understand why this thread was closed. The causes of this issue have still not been clearly identified or confirmed.

The Docker Compose setup of Airflow is not production ready. You should use it only for development and testing

I am not trying to use it for production; we are merely testing this approach and trying to understand it.

you should make sure that you use your host UID as the user id and GID=0 as the group.

I thought I did set AIRFLOW_UID to 1001 and AIRFLOW_GID to 0 (manually) in the .env file.
Is it not the same as doing echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env ?

BTW, I think you are not aware that your user changes when you run sudo... Most likely your logs/, dags/, etc. files have been created as owned by that root user and this is causing all your permission problems.

My dags, logs, etc. are not root-owned, they're user-owned (UID: 1001). I hope I am not badly mistaken about this, but if so, could you kindly clarify the misconception?
[screenshots: directory listings showing the dags/logs/plugins folders owned by UID 1001]

The solution you copied is to manually build the image and hard-code your UID as the user there.

Could you clarify what you mean by manually? Is it because I put it in the .env myself? As I have asked earlier, wouldn't the above-mentioned command do just that?

I will follow your recommendation about adding the user to the docker group to avoid using sudo. Thank you.

P.S.: Please find the .env file content below the docker-compose.yml enclosed in the initial post; it might have been easily missed since the main content is much larger.

@potiuk (Member) commented Jul 30, 2021

The reason it was closed is that you indicated you expect things to work out-of-the-box even though you said yourself that you tried to change users (first 50000 and then 1001) and that the behaviour changed from the previous run, which suggested you expected more from the docker compose than it is intended for.

The "quick start" is really to get you quick-started and if you want to change anything (like change the user and experiment with the setting), there is not much we can do to "solve" the issue you raised as bug. From your message it seems that you had a history of using this setup and that you "expect" it to work under this different circumstances (one of the problems was that you used sudo to run the docker compose which is guaranteed not to work).

You also indicated "If I have to create and use a single UID=50000 that will handle all airflow operations, then my airflow file system within the host cannot be operated properly with different users, e.g. devs when pulling new changes from git...", which means you wanted to make the docker compose work for many users, but that is definitely not its intention. It is there to let a single user quick-start and run Airflow on that user's machine. That's it. There are other ways to make Airflow work for multiple users in a development environment, but the quick-start docker compose is not one of them. That was the "production" use I was referring to, which was indeed a bit too narrow; "multi-user" would be more appropriate. It's not designed to be used by "multiple users", so it is not a bug that it does not work this way.

The quick-start documentation is just that: a quick start. No more, no less. You have to follow it strictly to get it working; if you deviate from it, it might or might not work. When you open an issue marked "bug" I think you expect it to be fixed, but there is no reasonable action anyone can take here to "fix" the problems you were experiencing. That's why the issue was closed.

The best thing you can do is to wipe out your whole setup, start from scratch, follow the quick-start precisely with no deviations, and see if it works. If it does not, please report it here with all the information you can and I will be happy to reopen the issue (if it indicates that our quick-start instructions are wrong), or even better, open a PR fixing it straight away.

If you want to discuss the usage and results of your experiments you can use GitHub Discussions or Slack. But opening a "bug" in this case is clearly a mistake resulting from misuse and not following the instructions, rather than a bug in Airflow. This is precisely why it was marked as "invalid" and closed.

And if you want to propose a new feature or change how the docker compose works, feel free. We actually have an open feature request for a more "versatile" docker-compose setup with more examples and possibly wizard-like generation (#16031). It would be great if you could contribute to that.

@potiuk (Member) commented Jul 30, 2021

Could you clarify what you mean by manually? Is it because I put it in the .env myself? As I have asked earlier, wouldn't the above-mentioned command do just that?

Manually means adding it to the Dockerfile and specifying it verbatim at image build time. This is at least one of the solutions in the thread that you mentioned (you did not say which one it was, so I picked the one that concluded the thread).
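
Purely as an illustration (not necessarily the exact approach from that thread), such a build could look roughly like this, run from a checkout of the apache/airflow sources and relying on the AIRFLOW_UID build argument of the Airflow Dockerfile:

docker build . --build-arg AIRFLOW_UID=1001 --tag my-airflow:2.1.0-uid1001

The tag name is arbitrary; the point is that the UID gets baked in at build time, which is exactly why it cannot be done in the public image.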

@potiuk (Member) commented Jul 30, 2021

I thought I did set AIRFLOW_UID to 1001 and AIRFLOW_GID to 0 (manually) in the .env file.
Is it not the same as doing echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env ?

This looks good. As mentioned above, I recommend you wipe it out, restart, and see if you still have problems. In your earlier comments you mentioned that you previously used a different user and used sudo to run the docker command (which is guaranteed not to work, because then your containers run as root rather than your own user). So my guess (as I wrote) was that you missed the .env setup and ran it without it as the sudo user (which would create those directories as the root user). That was my line of thought.
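
A wipe-and-restart along those lines could look like this (a sketch; the down flags are the ones used in the quick-start docs, the rest assumes the layout from this issue):

# stop everything and drop the database volume
docker-compose down --volumes --remove-orphans

# recreate the .env with your host UID and group 0, then re-initialize and start again
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
docker-compose up airflow-init
docker-compose up -d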

@MingTaLee commented May 29, 2023

I got a similar issue to @ImadYIdrissi's.

I'm setting up an Airflow development/test environment with the Airflow 2.5.3 Docker image and the docker-compose.yaml file from the official Apache Airflow website.

The server starting/running docker-compose is an AWS EC2 instance running Ubuntu 20.04.6 LTS.
The user running docker-compose up airflow-init and docker-compose up has UID 1002, and dags/logs/plugins are owned by that user on the host side.
(I did not use sudo when running the docker-compose commands.)

I already set AIRFLOW_UID=1002 in the .env file as suggested by @potiuk.

Checking inside the airflow-airlfow-worker-1 container after the containers were initialized and ready, I can see 3 users:
root / airflow / default
User "airflow" has UID 50000 and user "default" has UID 1002, but the latter is a nologin user and its home is set to the same as user airflow's (/home/airflow).

If I change the line [ user: "${AIRFLOW_UID:-50000}:0" ] to [ user: "${AIRFLOW_UID:-1002}:0" ] in docker-compose.yaml, there is no user account named "default"; however, inside the containers, the account "airflow" still uses UID 50000 and dags/logs/plugins still have owner 1002:root.

I have scrapped the containers and restarted with docker-compose down/up several times with different settings, and none gave me the expected ownership of the volume bind paths (i.e. dags/logs/plugins).
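
For reference, a way to compare both sides numerically (assuming the quick-start service names):

ls -ln dags logs plugins
docker-compose exec airflow-worker bash -c 'id; ls -ln /opt/airflow'

The first command shows the host-side ownership, the second the effective user and ownership inside the running worker.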

Are there any other settings that I missed?

Any input is appreciated! Thanks!

@potiuk (Member) commented May 29, 2023

What exact stack trace do you have? You mentioned "similar", but you forgot to attach the error or stack trace you got (unless you also have some modification in your image or the variables you pass to it).

I believe the problem might be that you also override the AIRFLOW_HOME variable or something similar.

By default when you set the AIRFLOW_UID variable, the following things are happening:

  1. The new user is created and what you observe is correct: it should have its home set to /home/airflow, UID=1002 and GID=0. So far so good.

This is what I have when I enter the image with UID=1002. This is exactly as expected:

default@aa130926bd39:/opt/airflow$ cat /etc/passwd | grep default
default:x:1002:0:default user:/home/airflow:/sbin/nologin
  2. The AIRFLOW_HOME variable should be set to /opt/airflow and the default airflow.cfg should already be created there:
default@aa130926bd39:/opt/airflow$ set |grep AIRFLOW_HOME
AIRFLOW_HOME=/opt/airflow
default@aa130926bd39:/opt/airflow$ ls ${AIRFLOW_HOME}
airflow.cfg  airflow.db  dags  logs  webserver_config.py
  3. I can run airflow config list and it works all right, showing the values from airflow.cfg, because the default user belongs to group 0, the /opt/airflow folder is owned by group 0, and the group has all permissions there. I can even remove airflow.cfg, run airflow help or another command, and it will get recreated:
default@aa130926bd39:/opt/airflow$ airflow config list
[core]
dags_folder = /opt/airflow/dags
hostname_callable = airflow.utils.net.getfqdn

(removed for brevity)

default@aa130926bd39:/opt/airflow$ ls
airflow.cfg  airflow.db  dags  logs  webserver_config.py
default@aa130926bd39:/opt/airflow$ rm airflow.cfg
default@aa130926bd39:/opt/airflow$ ls
airflow.db  dags  logs  webserver_config.py
default@aa130926bd39:/opt/airflow$ airflow help
usage: airflow [-h] GROUP_OR_COMMAND ...

(removed for brevity)

default@aa130926bd39:/opt/airflow$ ls
airflow.cfg  airflow.db  dags  logs  webserver_config.py
default@aa130926bd39:/opt/airflow$ ls -la
total 84
drwxrwxr-x 1 airflow root  4096 May 29 09:33 .
drwxr-xr-x 1 root    root  4096 Mar 31 22:55 ..
-rw------- 1 default root 51721 May 29 09:33 airflow.cfg
-rw-r--r-- 1 default root     0 May 29 09:24 airflow.db
drwxrwxr-x 2 airflow root  4096 Mar 31 22:55 dags
drwxrwxr-x 1 airflow root  4096 May 29 09:24 logs
-rw-rw-r-- 1 default root  4771 May 29 09:24 webserver_config.py
default@aa130926bd39:/opt/airflow$

So I wonder: what's your error, @MingTaLee, and what do the above commands show?

@MingTaLee commented May 30, 2023

@potiuk
Thank you very much for your reply, really appreciated!

Our purpose is to set up a development and testing environment for my colleagues to test their dags, and we would like to have the dags folder bound to another directory on the server side using Docker volume settings. My colleagues will use the account "airflow" to SSH into the worker container, modify dags, git push to the repository and test them.

However, since the volume bound into the container is now owned by the user named "default", user "airflow" does not have permission to do the work (modification and git). And if I modify the owner or permissions inside the container, I will mess them up on the server side...

Below is what I did / did not do:

  1. I didn't use Docker Swarm to manage the service, simply docker-compose up airflow-init and then docker-compose up to start. So unfortunately I don't have a stack trace to provide here (or do you mean there is something else called a stack trace? If so, please help me with how I can find it!).

  2. Another important issue I forgot to mention earlier: in my first few tests starting the service with docker-compose up (with the 2.5.3 Docker image from Airflow directly, no modification at this stage), the tests failed. After investigation, I found that the "webserver_config.py" file was not created properly: instead of a file, an empty folder with that name was created, and no airflow.cfg file was to be found. I had to manually provide the airflow.cfg and webserver_config.py I copied from my previous test (back when testing Airflow 1.10.12). NOT SURE WHETHER SOMETHING GOT MESSED UP IN THIS STEP... I simply assumed that since the service started, those settings should be OK.

  3. Later I modified the image a bit to add SSH and PyMySQL. Here is the modified Dockerfile:

FROM apache/airflow:2.5.3-python3.8

LABEL description="Modify from Airflow 2.5.3 image by Apache.  Add openssh-server / PyMySQL.  New_name: airflow253:v2.01"  version="2.01"

RUN export DEBIAN_FRONTEND=noninteractive \
    && python3 -m pip install --no-cache-dir --upgrade pip && python3 -m pip install --upgrade setuptools \
    && python3 -m pip install --no-cache-dir pymysql

USER root

RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get update && apt-get -y upgrade \
    && apt-get install -y openssh-server git \
    && apt-get purge && apt-get clean && apt-get autoclean && apt-get remove && apt-get -y autoremove \
    && rm -Rf /root/.cache/pip \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]

Although I don't think these modifications change the variables you mentioned, I may be wrong.

Below is the output of the commands you mentioned:

default@51d57fa9c7f2:/opt/airflow$ cat /etc/passwd | grep default
default:x:1002:0:default user:/home/airflow:/sbin/nologin
default@51d57fa9c7f2:/opt/airflow$ set |grep AIRFLOW_HOME
AIRFLOW_HOME=/opt/airflow
default@51d57fa9c7f2:/opt/airflow$ ls ${AIRFLOW_HOME}
airflow-worker.pid  airflow.cfg  dags  logs  plugins  webserver_config.py
default@51d57fa9c7f2:/opt/airflow$ airflow config list
[core]
dags_folder = /opt/airflow/dags
hostname_callable = airflow.utils.net.getfqdn
default_timezone = utc
executor = CeleryExecutor
parallelism = 32
max_active_tasks_per_dag = 16
dags_are_paused_at_creation = True
max_active_runs_per_dag = 16
load_examples = True
plugins_folder = /opt/airflow/plugins
execute_tasks_new_python_interpreter = False
fernet_key =
donot_pickle = True
dagbag_import_timeout = 30.0
dagbag_import_error_tracebacks = True
dagbag_import_error_traceback_depth = 2
dag_file_processor_timeout = 50
task_runner = StandardTaskRunner
default_impersonation =
security =
unit_test_mode = False
enable_xcom_pickling = False
allowed_deserialization_classes = airflow\..*
killed_task_cleanup_time = 60
dag_run_conf_overrides_params = True
dag_discovery_safe_mode = True
dag_ignore_file_syntax = regexp
default_task_retries = 0
default_task_retry_delay = 300
default_task_weight_rule = downstream
default_task_execution_timeout =
min_serialized_dag_update_interval = 30
compress_serialized_dags = False
min_serialized_dag_fetch_interval = 10
max_num_rendered_ti_fields_per_task = 30
check_slas = True
xcom_backend = airflow.models.xcom.BaseXCom
lazy_load_plugins = True
lazy_discover_providers = True
hide_sensitive_var_conn_fields = True
sensitive_var_conn_names =
default_pool_task_slot_count = 128
max_map_length = 1024
daemon_umask = 0o077
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@postgres/airflow

[database]
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@postgres/airflow
sql_engine_encoding = utf-8
sql_alchemy_pool_enabled = True
sql_alchemy_pool_size = 5
sql_alchemy_max_overflow = 10
sql_alchemy_pool_recycle = 1800
sql_alchemy_pool_pre_ping = True
sql_alchemy_schema =
load_default_connections = True
max_db_retries = 3

[logging]
base_log_folder = /opt/airflow/logs
remote_logging = False
remote_log_conn_id =
google_key_path =
remote_base_log_folder =
encrypt_s3_logs = False
logging_level = INFO
celery_logging_level =
fab_logging_level = WARNING
logging_config_class =
colored_console_log = True
colored_log_format = [%(blue)s%(asctime)s%(reset)s] {%(blue)s%(filename)s:%(reset)s%(lineno)d} %(log_color)s%(levelname)s%(reset)s - %(log_color)s%(message)s%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s
simple_log_format = %(asctime)s %(levelname)s - %(message)s
dag_processor_log_target = file
dag_processor_log_format = [%(asctime)s] [SOURCE:DAG_PROCESSOR] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s
log_formatter_class = airflow.utils.log.timezone_aware.TimezoneAware
task_log_prefix_template =
log_filename_template = dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/{% if ti.map_index >= 0 %}map_index={{ ti.map_index }}/{% endif %}attempt={{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = /opt/airflow/logs/dag_processor_manager/dag_processor_manager.log
task_log_reader = task
extra_logger_names =
worker_log_server_port = 8793

[metrics]
statsd_on = False
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
statsd_allow_list =
stat_name_handler =
statsd_datadog_enabled = False
statsd_datadog_tags =

[secrets]
backend =
backend_kwargs =

[cli]
api_client = airflow.api.client.local_client
endpoint_url = http://localhost:8080

[debug]
fail_fast = False

[api]
enable_experimental_api = False
auth_backends = airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session
maximum_page_limit = 100
fallback_page_limit = 100
google_oauth2_audience =
google_key_path =
access_control_allow_headers =
access_control_allow_methods =
access_control_allow_origins =

[lineage]
backend =

[atlas]
sasl_enabled = False
host =
port = 21000
username =
password =

[operators]
default_owner = airflow
default_cpus = 1
default_ram = 512
default_disk = 512
default_gpus = 0
default_queue = default
allow_illegal_arguments = False

[hive]
default_hive_mapred_queue =

[webserver]
base_url = http://localhost:8080
default_ui_timezone = UTC
web_server_host = 0.0.0.0
web_server_port = 8080
web_server_ssl_cert =
web_server_ssl_key =
session_backend = database
web_server_master_timeout = 120
web_server_worker_timeout = 120
worker_refresh_batch_size = 1
worker_refresh_interval = 6000
reload_on_plugin_change = False
secret_key = BSlegi2JIGb8pADrl2RNYw==
workers = 4
worker_class = sync
access_logfile = -
error_logfile = -
access_logformat =
expose_config = False
expose_hostname = False
expose_stacktrace = False
dag_default_view = grid
dag_orientation = LR
log_fetch_timeout_sec = 5
log_fetch_delay_sec = 2
log_auto_tailing_offset = 30
log_animation_speed = 1000
hide_paused_dags_by_default = False
page_size = 100
navbar_color = #fff
default_dag_run_display_number = 25
enable_proxy_fix = False
proxy_fix_x_for = 1
proxy_fix_x_proto = 1
proxy_fix_x_host = 1
proxy_fix_x_port = 1
proxy_fix_x_prefix = 1
cookie_secure = False
cookie_samesite = Lax
default_wrap = False
x_frame_enabled = True
show_recent_stats_for_completed_runs = True
update_fab_perms = True
session_lifetime_minutes = 43200
instance_name_has_markup = False
auto_refresh_interval = 3
warn_deployment_exposure = True
audit_view_excluded_events = gantt,landing_times,tries,duration,calendar,graph,grid,tree,tree_data

[email]
email_backend = airflow.utils.email.send_email_smtp
email_conn_id = smtp_default
default_email_on_retry = True
default_email_on_failure = True

[smtp]
smtp_host = localhost
smtp_starttls = True
smtp_ssl = False
smtp_port = 25
smtp_mail_from = [email protected]
smtp_timeout = 30
smtp_retry_limit = 5

[sentry]
sentry_on = False
sentry_dsn =

[local_kubernetes_executor]
kubernetes_queue = kubernetes

[celery_kubernetes_executor]
kubernetes_queue = kubernetes

[celery]
celery_app_name = airflow.executors.celery_executor
worker_concurrency = 16
worker_prefetch_multiplier = 1
worker_enable_remote_control = True
broker_url = redis://:@redis:6379/0
flower_host = 0.0.0.0
flower_url_prefix =
flower_port = 5555
flower_basic_auth =
sync_parallelism = 0
celery_config_options = airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG
ssl_active = False
ssl_key =
ssl_cert =
ssl_cacert =
pool = prefork
operation_timeout = 1.0
task_track_started = True
task_adoption_timeout = 600
stalled_task_timeout = 0
task_publish_max_retries = 3
worker_precheck = False
result_backend = db+postgresql://airflow:airflow@postgres/airflow

[celery_broker_transport_options]

[dask]
cluster_address = 127.0.0.1:8786
tls_ca =
tls_cert =
tls_key =

[scheduler]
job_heartbeat_sec = 5
scheduler_heartbeat_sec = 5
num_runs = -1
scheduler_idle_sleep_time = 1
min_file_process_interval = 30
parsing_cleanup_interval = 60
dag_dir_list_interval = 300
print_stats_interval = 30
pool_metrics_interval = 5.0
scheduler_health_check_threshold = 30
enable_health_check = True
scheduler_health_check_server_port = 8974
orphaned_tasks_check_interval = 300.0
child_process_log_directory = /opt/airflow/logs/scheduler
scheduler_zombie_task_threshold = 300
zombie_detection_interval = 10.0
catchup_by_default = True
ignore_first_depends_on_past_by_default = True
max_tis_per_query = 512
use_row_level_locking = True
max_dagruns_to_create_per_loop = 10
max_dagruns_per_loop_to_schedule = 20
schedule_after_task_execution = True
parsing_processes = 2
file_parsing_sort_mode = modified_time
standalone_dag_processor = False
max_callbacks_per_loop = 20
dag_stale_not_seen_duration = 600
use_job_schedule = True
allow_trigger_in_future = False
trigger_timeout_check_interval = 15

[triggerer]
default_capacity = 1000

[kerberos]
ccache = /tmp/airflow_krb5_ccache
principal = airflow
reinit_frequency = 3600
kinit_path = kinit
keytab = airflow.keytab
forwardable = True
include_ip = True

[elasticsearch]
host =
log_id_template = {dag_id}-{task_id}-{run_id}-{map_index}-{try_number}
end_of_log_mark = end_of_log
frontend =
write_stdout = False
json_format = False
json_fields = asctime, filename, lineno, levelname, message
host_field = host
offset_field = offset

[elasticsearch_configs]
use_ssl = False
verify_certs = True

[kubernetes_executor]
pod_template_file =
worker_container_repository =
worker_container_tag =
namespace = default
delete_worker_pods = True
delete_worker_pods_on_failure = False
worker_pods_creation_batch_size = 1
multi_namespace_mode = False
in_cluster = True
kube_client_request_args =
delete_option_kwargs =
enable_tcp_keepalive = True
tcp_keep_idle = 120
tcp_keep_intvl = 30
tcp_keep_cnt = 6
verify_ssl = True
worker_pods_pending_timeout = 300
worker_pods_pending_timeout_check_interval = 120
worker_pods_queued_check_interval = 60
worker_pods_pending_timeout_batch_size = 100

[sensors]
default_timeout = 604800

Thanks for your valuable help, and please let me know if there is any information you need.

@potiuk (Member) commented May 30, 2023

This is your problem to solve. I am not reading all the details, but if you wish to continue this discussion, please open a new one; you are piggybacking on someone else's different error and different case, and hijacking this closed issue for a somewhat related but different issue.

And please do not add even more issues here. You seem to have an issue with designing a company-wide solution based on docker-compose, which is quite a bit beyond "why the quick-start docker compose does not work as advertised". You may need professional paid help with solving these problems if you are not able to design it on your own.

I am not going to have time (in my free time) to solve the problem you have and design a solution that will work for your company and team, but I can tell you some of the assumptions of the image we have and point you to the right docs to read to understand thoroughly what's going on so that you can solve it. If you want to turn that into a company-wide solution, you need to map it to those assumptions as you see fit.

If you are trying to use our docker compose ("quick start") for your "company-wide" deployment, you are mostly on your own to make it work well for your case (this is how docker-compose works) and you should modify it to fit your needs. Our quick-start docker-compose is just a starting point and reference for someone to write their own (if they wish), and (as indicated in the docker-compose file) it has plenty of things you will likely have to modify when designing your own docker-compose to make it work.

I will have no time to dive deep into your solution (we are all helping here in our free time), so I can give at most some generic advice.

The Airflow image works with the assumption that you have either "airflow"-owned files and folders or group "0"-owned ones (this is for OpenShift compatibility). See all the details in the docs: https://airflow.apache.org/docs/docker-stack/entrypoint.html. What happens when you use a different UID (and it is all described in the docs) is that the entrypoint creates a new user, makes it belong to group "0" and sets its home to the same as the "airflow" user's. All files and folders should then be created and owned by group "0" with group read/write for this to work. So if you wish to do that and share files and folders somehow, you need to make sure group "0" owns them. If you are using the "airflow" user by default, you should just make sure that your volumes are read/write for the "airflow" user.
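
A minimal sketch of what that can look like on the host side for a shared, bind-mounted dags folder (the path here is an assumption, not taken from this thread):

# give group 0 ownership and group read/write so any entrypoint-created user (GID=0) can work with it
sudo chown -R :0 ./dags
sudo chmod -R g+rwX ./dags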

How to do that exactly, if you build some kind of sharing and git workflow based on the docker compose, is primarily your job to figure out; it depends on what you want to do. I have no ready recipes here, I am afraid.

@MingTaLee

Thanks for your input and info. Will test with a fresh new VM to pinpoint the issues.
