diff --git a/airflow-core/docs/administration-and-deployment/listeners.rst b/airflow-core/docs/administration-and-deployment/listeners.rst
index e691ff63ce421..b1ccf181d0c91 100644
--- a/airflow-core/docs/administration-and-deployment/listeners.rst
+++ b/airflow-core/docs/administration-and-deployment/listeners.rst
@@ -131,7 +131,7 @@ Airflow defines the specification as `hookspec `.
-Listener API is meant to be called across all dags and all operators. You can't listen to events generated by specific dags. For that behavior, try methods like ``on_success_callback`` and ``pre_execute``. These provide callbacks for particular DAG authors or operator creators. The logs and ``print()`` calls will be handled as part of the listeners.
+Listener API is meant to be called across all dags and all operators. You can't listen to events generated by specific dags. For that behavior, try methods like ``on_success_callback`` and ``pre_execute``. These provide callbacks for particular Dag authors or operator creators. The logs and ``print()`` calls will be handled as part of the listeners.
 Compatibility note
diff --git a/airflow-core/docs/authoring-and-scheduling/deferring.rst b/airflow-core/docs/authoring-and-scheduling/deferring.rst
index 08a9058e4c623..6cee71195da5f 100644
--- a/airflow-core/docs/authoring-and-scheduling/deferring.rst
+++ b/airflow-core/docs/authoring-and-scheduling/deferring.rst
@@ -31,7 +31,7 @@ An overview of how this process works:
 * The trigger runs until it fires, at which point its source task is re-scheduled by the scheduler.
 * The scheduler queues the task to resume on a worker node.
-You can either use pre-written deferrable operators as a DAG author or write your own. Writing them, however, requires that they meet certain design criteria.
+You can either use pre-written deferrable operators as a Dag author or write your own. Writing them, however, requires that they meet certain design criteria.
 Using Deferrable Operators
 --------------------------
diff --git a/airflow-core/docs/authoring-and-scheduling/dynamic-task-mapping.rst b/airflow-core/docs/authoring-and-scheduling/dynamic-task-mapping.rst
index 03119f411a6df..e254bf52071f3 100644
--- a/airflow-core/docs/authoring-and-scheduling/dynamic-task-mapping.rst
+++ b/airflow-core/docs/authoring-and-scheduling/dynamic-task-mapping.rst
@@ -21,7 +21,7 @@
 Dynamic Task Mapping
 ====================
-Dynamic Task Mapping allows a way for a workflow to create a number of tasks at runtime based upon current data, rather than the DAG author having to know in advance how many tasks would be needed.
+Dynamic Task Mapping allows a way for a workflow to create a number of tasks at runtime based upon current data, rather than the Dag author having to know in advance how many tasks would be needed.
 This is similar to defining your tasks in a for loop, but instead of having the DAG file fetch the data and do that itself, the scheduler can do this based on the output of a previous task. Right before a mapped task is executed the scheduler will create *n* copies of the task, one for each input.
diff --git a/airflow-core/docs/best-practices.rst b/airflow-core/docs/best-practices.rst
index f75d180cad3b0..839c95dbfe94f 100644
--- a/airflow-core/docs/best-practices.rst
+++ b/airflow-core/docs/best-practices.rst
@@ -954,12 +954,12 @@ The benefits of the operator are:
   Airflow dependencies) to make use of multiple virtual environments
 * You can run tasks with different sets of dependencies on the same workers - thus Memory resources are
   reused (though see below about the CPU overhead involved in creating the venvs).
-* In bigger installations, DAG Authors do not need to ask anyone to create the venvs for you.
-  As a DAG Author, you only have to have virtualenv dependency installed and you can specify and modify the
+* In bigger installations, Dag authors do not need to ask anyone to create the venvs for you.
+  As a Dag author, you only have to have virtualenv dependency installed and you can specify and modify the
   environments as you see fit.
 * No changes in deployment requirements - whether you use Local virtualenv, or Docker, or Kubernetes,
   the tasks will work without adding anything to your deployment.
-* No need to learn more about containers, Kubernetes as a DAG Author. Only knowledge of Python requirements
+* No need to learn more about containers, Kubernetes as a Dag author. Only knowledge of Python requirements
   is required to author dags this way.
 There are certain limitations and overhead introduced by this operator:
@@ -1005,7 +1005,7 @@ and available in all the workers in case your Airflow runs in a distributed envi
 This way you avoid the overhead and problems of re-creating the virtual environment but they have to be
 prepared and deployed together with Airflow installation. Usually people who manage Airflow installation
-need to be involved, and in bigger installations those are usually different people than DAG Authors
+need to be involved, and in bigger installations those are usually different people than Dag authors
 (DevOps/System Admins).
 Those virtual environments can be prepared in various ways - if you use LocalExecutor they just need to be installed
@@ -1024,7 +1024,7 @@ The benefits of the operator are:
   be added dynamically. This is good for both, security and stability.
 * Limited impact on your deployment - you do not need to switch to Docker containers or Kubernetes to
   make a good use of the operator.
-* No need to learn more about containers, Kubernetes as a DAG Author. Only knowledge of Python, requirements
+* No need to learn more about containers, Kubernetes as a Dag author. Only knowledge of Python, requirements
   is required to author dags this way.
 The drawbacks:
@@ -1045,7 +1045,7 @@ The drawbacks:
   same worker might be affected by previous tasks creating/modifying files etc.
 You can think about the ``PythonVirtualenvOperator`` and ``ExternalPythonOperator`` as counterparts -
-that make it smoother to move from development phase to production phase. As a DAG author you'd normally
+that make it smoother to move from development phase to production phase. As a Dag author you'd normally
 iterate with dependencies and develop your DAG using ``PythonVirtualenvOperator`` (thus decorating
 your tasks with ``@task.virtualenv`` decorators) while after the iteration and changes you would likely
 want to change it for production to switch to the ``ExternalPythonOperator`` (and ``@task.external_python``)
diff --git a/airflow-core/docs/core-concepts/overview.rst b/airflow-core/docs/core-concepts/overview.rst
index 82f669bcd119c..96eda6bdff1e4 100644
--- a/airflow-core/docs/core-concepts/overview.rst
+++ b/airflow-core/docs/core-concepts/overview.rst
@@ -98,7 +98,7 @@ and can be scaled by running multiple instances of the components above.
 The separation of components also allow for increased security, by isolating the components from each other
 and by allowing to perform different tasks. For example separating *dag processor* from *scheduler*
 allows to make sure that the *scheduler* does not have access to the *DAG files* and cannot execute
-code provided by *DAG author*.
+code provided by *Dag author*.
 Also while single person can run and manage Airflow installation, Airflow Deployment in more complex
 setup can involve various roles of users that can interact with different parts of the system, which is
@@ -106,7 +106,7 @@ an important aspect of secure Airflow deployment. The roles are described in det
 :doc:`/security/security_model` and generally speaking include:
 * Deployment Manager - a person that installs and configures Airflow and manages the deployment
-* DAG author - a person that writes dags and submits them to Airflow
+* Dag author - a person that writes dags and submits them to Airflow
 * Operations User - a person that triggers dags and tasks and monitors their execution
 Architecture Diagrams
@@ -153,13 +153,13 @@ Distributed Airflow architecture
 ................................
 This is the architecture of Airflow where components of Airflow are distributed among multiple machines
-and where various roles of users are introduced - *Deployment Manager*, **DAG author**,
+and where various roles of users are introduced - *Deployment Manager*, **Dag author**,
 **Operations User**. You can read more about those various roles in the :doc:`/security/security_model`.
 In the case of a distributed deployment, it is important to consider the security aspects of the components.
 The *webserver* does not have access to the *DAG files* directly. The code in the ``Code`` tab of the
 UI is read from the *metadata database*. The *webserver* cannot execute any code submitted by the
-**DAG author**. It can only execute code that is installed as an *installed package* or *plugin* by
+**Dag author**. It can only execute code that is installed as an *installed package* or *plugin* by
 the **Deployment Manager**. The **Operations User** only has access to the UI and can only trigger
 dags and tasks, but cannot author dags.
@@ -178,7 +178,7 @@ Separate DAG processing architecture
 In a more complex installation where security and isolation are important, you'll also see the
 standalone *dag processor* component that allows to separate *scheduler* from accessing *DAG files*.
 This is suitable if the deployment focus is on isolation between parsed tasks. While Airflow does not yet
-support full multi-tenant features, it can be used to make sure that **DAG author** provided code is never
+support full multi-tenant features, it can be used to make sure that **Dag author** provided code is never
 executed in the context of the scheduler.
 .. image:: ../img/diagram_dag_processor_airflow_architecture.png
diff --git a/airflow-core/docs/core-concepts/params.rst b/airflow-core/docs/core-concepts/params.rst
index f6d8a2c5c7a89..2ce7403df6747 100644
--- a/airflow-core/docs/core-concepts/params.rst
+++ b/airflow-core/docs/core-concepts/params.rst
@@ -191,7 +191,7 @@ JSON Schema Validation
 .. note::
     If ``schedule`` is defined for a DAG, params with defaults must be valid. This is validated during DAG parsing.
     If ``schedule=None`` then params are not validated during DAG parsing but before triggering a DAG.
-    This is useful in cases where the DAG author does not want to provide defaults but wants to force users provide valid parameters
+    This is useful in cases where the Dag author does not want to provide defaults but wants to force users provide valid parameters
     at time of trigger.
 .. note::
diff --git a/airflow-core/docs/installation/index.rst b/airflow-core/docs/installation/index.rst
index a72f44fb40fd7..f6a327444a27c 100644
--- a/airflow-core/docs/installation/index.rst
+++ b/airflow-core/docs/installation/index.rst
@@ -347,7 +347,7 @@ The requirements that Airflow might need depend on many factors, including (but
   the technology/cloud/integration of monitoring etc.
 * Technical details of database, hardware, network, etc. that your deployment is running on
 * The complexity of the code you add to your DAGS, configuration, plugins, settings etc. (note, that
-  Airflow runs the code that DAG author and Deployment Manager provide)
+  Airflow runs the code that Dag author and Deployment Manager provide)
 * The number and choice of providers you install and use (Airflow has more than 80 providers) that can be
   installed by choice of the Deployment Manager and using them might require more resources.
 * The choice of parameters that you use when tuning Airflow. Airflow has many configuration parameters
diff --git a/airflow-core/docs/installation/upgrading_to_airflow3.rst b/airflow-core/docs/installation/upgrading_to_airflow3.rst
index 1a6603f324c1b..64d5a02ed9ad0 100644
--- a/airflow-core/docs/installation/upgrading_to_airflow3.rst
+++ b/airflow-core/docs/installation/upgrading_to_airflow3.rst
@@ -50,7 +50,7 @@ Airflow 3.x Architecture
 Database Access Restrictions
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-In Airflow 3, direct metadata database access from task code is now restricted. This is a key security and architectural improvement that affects how DAG authors interact with Airflow resources:
+In Airflow 3, direct metadata database access from task code is now restricted. This is a key security and architectural improvement that affects how Dag authors interact with Airflow resources:
 - **No Direct Database Access**: Task code can no longer directly import and use Airflow database sessions or models.
 - **API-Based Resource Access**: All runtime interactions (state transitions, heartbeats, XComs, and resource fetching) are handled through a dedicated Task Execution API.
@@ -83,7 +83,7 @@ Step 2: Clean and back up your existing Airflow Instance
 ensure you deploy your changes to your old instance prior to upgrade, and wait until your dags have all been reprocessed (and all errors gone) before you proceed with upgrade.
-Step 3: Dag Authors - Check your Airflow dags for compatibility
+Step 3: Dag authors - Check your Airflow dags for compatibility
 ----------------------------------------------------------------
 To minimize friction for users upgrading from prior versions of Airflow, we have created a dag upgrade check utility using `Ruff `_ combined with `AIR `_ rules.
diff --git a/airflow-core/docs/public-airflow-interface.rst b/airflow-core/docs/public-airflow-interface.rst
index 0f685c16ccea4..b8902a8cd8e2c 100644
--- a/airflow-core/docs/public-airflow-interface.rst
+++ b/airflow-core/docs/public-airflow-interface.rst
@@ -36,8 +36,8 @@ and extending Airflow capabilities by writing new executors, plugins, operators
 Public Interface can be useful for building custom tools and integrations with other systems,
 and for automating certain aspects of the Airflow workflow.
-The primary public interface for DAG Authors and task execution is using task SDK
-Airflow task SDK is the primary public interface for DAG Authors and for task execution
+The primary public interface for Dag authors and task execution is using task SDK
+Airflow task SDK is the primary public interface for Dag authors and for task execution
 :doc:`airflow.sdk namespace `. Direct access to the metadata database
 from task code is no longer allowed. Instead, use the :doc:`Stable REST API `,
 `Python Client `_, or Task Context methods.
@@ -87,12 +87,12 @@ in details (such as output format and available flags) so if you want to rely on
 way, the Stable REST API is recommended.
-Using the Public Interface for DAG Authors
+Using the Public Interface for Dag authors
 ==========================================
-The primary interface for DAG Authors is the :doc:`airflow.sdk namespace `.
+The primary interface for Dag authors is the :doc:`airflow.sdk namespace `.
 This provides a stable, well-defined interface for creating DAGs and tasks that is not subject to internal
-implementation changes. The goal of this change is to decouple DAG authoring from Airflow internals (Scheduler,
+implementation changes. The goal of this change is to decouple Dag authoring from Airflow internals (Scheduler,
 API Server, etc.), providing a version-agnostic, stable interface for writing and maintaining DAGs across Airflow versions.
 **Key Imports from airflow.sdk:**
@@ -164,17 +164,17 @@ You can read more about dags in :doc:`Dags `.
 References for the modules used in dags are here:
 .. note::
-    The airflow.sdk namespace provides the primary interface for DAG Authors.
+    The airflow.sdk namespace provides the primary interface for Dag authors.
     For detailed API documentation, see the `Task SDK Reference `_.
 .. note::
     The :class:`~airflow.models.dagbag.DagBag` class is used internally by Airflow for loading DAGs
-    from files and folders. DAG Authors should use the :class:`~airflow.sdk.DAG` class from the
+    from files and folders. Dag authors should use the :class:`~airflow.sdk.DAG` class from the
     airflow.sdk namespace instead.
 .. note::
     The :class:`~airflow.models.dagrun.DagRun` class is used internally by Airflow for DAG run
-    management. DAG Authors should access DAG run information through the Task Context via
+    management. Dag authors should access DAG run information through the Task Context via
     :func:`~airflow.sdk.get_current_context` or use the :class:`~airflow.sdk.types.DagRunProtocol` interface.
@@ -231,7 +231,7 @@ Example of accessing task instance information through Task Context:
 .. note::
     The :class:`~airflow.models.taskinstancekey.TaskInstanceKey` class is used internally by Airflow
-    for identifying task instances. DAG Authors should access task instance information through the
+    for identifying task instances. Dag authors should access task instance information through the
     Task Context via :func:`~airflow.sdk.get_current_context` instead.
@@ -257,7 +257,7 @@ by extending them:
 Public Airflow utilities
 ========================
-When writing or extending Hooks and Operators, DAG Authors and developers can
+When writing or extending Hooks and Operators, Dag authors and developers can
 use the following classes:
 * The :class:`~airflow.sdk.Connection`, which provides access to external service credentials and configuration.
@@ -485,10 +485,10 @@ implemented in the community providers.
 Decorators
 ==========
-DAG Authors can use decorators to author dags using the :doc:`TaskFlow ` concept.
+Dag authors can use decorators to author dags using the :doc:`TaskFlow ` concept.
 All Decorators derive from :class:`~airflow.sdk.bases.decorator.TaskDecorator`.
-The primary decorators for DAG Authors are now in the airflow.sdk namespace:
+The primary decorators for Dag authors are now in the airflow.sdk namespace:
 :func:`~airflow.sdk.dag`, :func:`~airflow.sdk.task`, :func:`~airflow.sdk.asset`,
 :func:`~airflow.sdk.setup`, :func:`~airflow.sdk.task_group`, :func:`~airflow.sdk.teardown`,
 :func:`~airflow.sdk.chain`, :func:`~airflow.sdk.chain_linear`, :func:`~airflow.sdk.cross_downstream`,
diff --git a/airflow-core/docs/security/security_model.rst b/airflow-core/docs/security/security_model.rst
index 2ebff598c54bd..ff9fb62f4c1d7 100644
--- a/airflow-core/docs/security/security_model.rst
+++ b/airflow-core/docs/security/security_model.rst
@@ -39,7 +39,7 @@ This is why Airflow has the following user types:
 * Deployment Managers - overall responsible for the Airflow installation, security and configuration
 * Authenticated UI users - users that can access Airflow UI and API and interact with it
-* DAG Authors - responsible for creating dags and submitting them to Airflow
+* Dag authors - responsible for creating dags and submitting them to Airflow
 You can see more on how the user types influence Airflow's architecture in :doc:`/core-concepts/overview`,
 including, seeing the diagrams of less and more complex deployments.
@@ -58,14 +58,14 @@ can also decide to keep audits, backups and copies of information outside of
 Airflow, which are not covered by Airflow's security model.
-DAG Authors
+Dag authors
 ...........
 They can create, modify, and delete DAG files. The
 code in DAG files is executed on workers and in the DAG Processor.
-Therefore, DAG authors can create and change code executed on workers
+Therefore, Dag authors can create and change code executed on workers
 and the DAG Processor and potentially access the credentials that the DAG
-code uses to access external systems. DAG Authors have full access
+code uses to access external systems. Dag authors have full access
 to the metadata database.
 Authenticated UI users
@@ -146,12 +146,12 @@ Viewers also do not have permission to access audit logs.
 For more information on the capabilities of authenticated UI users, see :doc:`apache-airflow-providers-fab:auth-manager/access-control`.
-Capabilities of DAG Authors
+Capabilities of Dag authors
 ---------------------------
-DAG authors are able to create or edit code - via Python files placed in a dag bundle - that will be executed
+Dag authors are able to create or edit code - via Python files placed in a dag bundle - that will be executed
 in a number of circumstances. The code to execute is neither verified, checked nor sand-boxed by Airflow
-(that would be very difficult if not impossible to do), so effectively DAG authors can execute arbitrary
+(that would be very difficult if not impossible to do), so effectively Dag authors can execute arbitrary
 code on the workers (part of Celery Workers for Celery Executor, local processes run by scheduler in case
 of Local Executor, Task Kubernetes POD in case of Kubernetes Executor), in the DAG Processor
 and in the Triggerer.
@@ -161,80 +161,80 @@ There are several consequences of this model chosen by Airflow, that deployment
 Local executor
 ..............
-In case of Local Executor, DAG authors can execute arbitrary code on the machine where scheduler is running.
+In case of Local Executor, Dag authors can execute arbitrary code on the machine where scheduler is running.
 This means that they can affect the scheduler process itself, and potentially affect the whole Airflow
 installation - including modifying cluster-wide policies and changing Airflow configuration. If you are running
-Airflow with Local Executor, the Deployment Manager must trust the DAG authors not to abuse this capability.
+Airflow with Local Executor, the Deployment Manager must trust the Dag authors not to abuse this capability.
 Celery Executor
 ...............
-In case of Celery Executor, DAG authors can execute arbitrary code on the Celery Workers. This means that
+In case of Celery Executor, Dag authors can execute arbitrary code on the Celery Workers. This means that
 they can potentially influence all the tasks executed on the same worker. If you are running Airflow with
-Celery Executor, the Deployment Manager must trust the DAG authors not to abuse this capability and unless
+Celery Executor, the Deployment Manager must trust the Dag authors not to abuse this capability and unless
 Deployment Manager separates task execution by queues by Cluster Policies, they should assume, there is no
 isolation between tasks.
 Kubernetes Executor
 ...................
-In case of Kubernetes Executor, DAG authors can execute arbitrary code on the Kubernetes POD they run. Each
+In case of Kubernetes Executor, Dag authors can execute arbitrary code on the Kubernetes POD they run. Each
 task is executed in a separate POD, so there is already isolation between tasks as generally speaking
 Kubernetes provides isolation between PODs.
 Triggerer
 .........
-In case of Triggerer, DAG authors can execute arbitrary code in Triggerer. Currently there are no
+In case of Triggerer, Dag authors can execute arbitrary code in Triggerer. Currently there are no
 enforcement mechanisms that would allow to isolate tasks that are using deferrable functionality from
 each other and arbitrary code from various tasks can be executed in the same process/machine. Deployment
-Manager must trust that DAG authors will not abuse this capability.
+Manager must trust that Dag authors will not abuse this capability.
 DAG files not needed for Scheduler and API Server
 .................................................
-The Deployment Manager might isolate the code execution provided by DAG authors - particularly in
+The Deployment Manager might isolate the code execution provided by Dag authors - particularly in
 Scheduler and API Server by making sure that the Scheduler and API Server don't even
-have access to the DAG Files. Generally speaking - no DAG author provided code should ever be
+have access to the DAG Files. Generally speaking - no Dag author provided code should ever be
 executed in the Scheduler or API Server process. This means the deployment manager can exclude credentials
 needed for dag bundles on the Scheduler and API Server - but the bundles must still be configured on those
 components.
-Allowing DAG authors to execute selected code in Scheduler and API Server
+Allowing Dag authors to execute selected code in Scheduler and API Server
 .........................................................................
-There are a number of functionalities that allow the DAG author to use pre-registered custom code to be
+There are a number of functionalities that allow the Dag author to use pre-registered custom code to be
 executed in the Scheduler or API Server process - for example they can choose custom Timetables, UI plugins,
 Connection UI Fields, Operator extra links, macros, listeners - all of those functionalities allow the
-DAG author to choose the code that will be executed in the Scheduler or API Server process. However this
-should not be arbitrary code that DAG author can add dag bundles. All those functionalities are
+Dag author to choose the code that will be executed in the Scheduler or API Server process. However this
+should not be arbitrary code that Dag author can add dag bundles. All those functionalities are
 only available via ``plugins`` and ``providers`` mechanisms where the code that is executed can only be
 provided by installed packages (or in case of plugins it can also be added to PLUGINS folder where DAG
 authors should not have write access to). PLUGINS_FOLDER is a legacy mechanism coming from Airflow 1.10
 - but we recommend using entrypoint mechanism that allows the Deployment Manager to - effectively -
-choose and register the code that will be executed in those contexts. DAG Author has no access to
+choose and register the code that will be executed in those contexts. Dag author has no access to
 install or modify packages installed in Scheduler and API Server, and this is the way to prevent
-the DAG Author to execute arbitrary code in those processes.
+the Dag author to execute arbitrary code in those processes.
 Additionally, if you decide to utilize and configure the PLUGINS_FOLDER, it is essential for the Deployment
-Manager to ensure that the DAG author does not have write access to this folder.
+Manager to ensure that the Dag author does not have write access to this folder.
-The Deployment Manager might decide to introduce additional control mechanisms to prevent DAG authors from
+The Deployment Manager might decide to introduce additional control mechanisms to prevent Dag authors from
 executing arbitrary code. This is all fully in hands of the Deployment Manager and it is discussed in the
 following chapter.
 Access to all dags
 ........................................................................
-All dag authors have access to all dags in the Airflow deployment. This means that they can view, modify,
+All Dag authors have access to all dags in the Airflow deployment. This means that they can view, modify,
 and update any dag without restrictions at any time.
 Responsibilities of Deployment Managers
 ---------------------------------------
-As a Deployment Manager, you should be aware of the capabilities of DAG authors and make sure that
+As a Deployment Manager, you should be aware of the capabilities of Dag authors and make sure that
 you trust them not to abuse the capabilities they have. You should also make sure that you have
-properly configured the Airflow installation to prevent DAG authors from executing arbitrary code
+properly configured the Airflow installation to prevent Dag authors from executing arbitrary code
 in the Scheduler and API Server processes.
 Deploying and protecting Airflow installation
@@ -252,10 +252,10 @@ Airflow is deployed. This includes but is not limited to:
 * any kind of detection of unusual activity and protection against it
 * choosing the right session backend and configuring it properly including timeouts for the session
-Limiting DAG Author capabilities
+Limiting Dag author capabilities
 .................................
-The Deployment Manager might also use additional mechanisms to prevent DAG authors from executing
+The Deployment Manager might also use additional mechanisms to prevent Dag authors from executing
 arbitrary code - for example they might introduce tooling around DAG submission that would allow
 to review the code before it is deployed, statically-check it and add other ways to prevent malicious
 code to be submitted. The way submitting code to a DAG bundle is done and protected is completely
diff --git a/airflow-core/docs/tutorial/pipeline.rst b/airflow-core/docs/tutorial/pipeline.rst
index cc6f41e497749..4928064401f9f 100644
--- a/airflow-core/docs/tutorial/pipeline.rst
+++ b/airflow-core/docs/tutorial/pipeline.rst
@@ -34,7 +34,7 @@ By the end of this tutorial, you'll have a working pipeline that:
 - Loads the data into a staging table
 - Cleans the data and upserts it into a target table
-Along the way, you'll gain hands-on experience with Airflow's UI, connection system, SQL execution, and DAG authoring
+Along the way, you'll gain hands-on experience with Airflow's UI, connection system, SQL execution, and Dag authoring
 patterns.
 Want to go deeper as you go? Here are two helpful references:
diff --git a/contributing-docs/05_pull_requests.rst b/contributing-docs/05_pull_requests.rst
index 5a7a4d96e13b4..5b7c5f93f9510 100644
--- a/contributing-docs/05_pull_requests.rst
+++ b/contributing-docs/05_pull_requests.rst
@@ -177,7 +177,7 @@ To make this easier, there is the ``create_session`` helper:
 .. warning::
   **DO NOT** add a default to the ``session`` argument **unless** ``@provide_session`` is used.
-If this function is designed to be called by "end-users" (i.e. DAG authors) then using the ``@provide_session`` wrapper is okay:
+If this function is designed to be called by "end-users" (i.e. Dag authors) then using the ``@provide_session`` wrapper is okay:
 .. code-block:: python
diff --git a/task-sdk/docs/concepts.rst b/task-sdk/docs/concepts.rst
index 98e17b6c4f64f..25b45ba305acf 100644
--- a/task-sdk/docs/concepts.rst
+++ b/task-sdk/docs/concepts.rst
@@ -18,7 +18,7 @@
 Concepts
 ========
-This section covers the fundamental concepts that DAG authors need to understand when working with the Task SDK.
+This section covers the fundamental concepts that Dag authors need to understand when working with the Task SDK.
 .. note::
@@ -32,7 +32,7 @@ Terminology
 Task Lifecycle
 --------------
-Understanding the task lifecycle helps DAG authors write more effective tasks and debug issues:
+Understanding the task lifecycle helps Dag authors write more effective tasks and debug issues:
 - **Scheduled**: The Airflow scheduler enqueues the task instance. The Executor assigns a workload token used for subsequent API authentication and validation with the Airflow API Server.
 - **Queued**: Workers poll the queue to retrieve and reserve queued task instances.
diff --git a/task-sdk/docs/index.rst b/task-sdk/docs/index.rst
index 4c699f9f97f42..6c4621dd12241 100644
--- a/task-sdk/docs/index.rst
+++ b/task-sdk/docs/index.rst
@@ -24,7 +24,7 @@ executing tasks in isolated subprocesses and interacting with Airflow resources
 It also includes core execution-time components to manage communication between the worker and the Airflow scheduler/backend.
-The goal of task-sdk is to decouple DAG authoring from Airflow internals (Scheduler, API Server, etc.), providing a forward-compatible, stable interface for writing and maintaining DAGs across Airflow versions. This approach reduces boilerplate and keeps your DAG definitions concise and readable.
+The goal of task-sdk is to decouple Dag authoring from Airflow internals (Scheduler, API Server, etc.), providing a forward-compatible, stable interface for writing and maintaining DAGs across Airflow versions. This approach reduces boilerplate and keeps your DAG definitions concise and readable.
 1. Introduction and Getting Started
 -----------------------------------
@@ -59,9 +59,9 @@ Airflow now supports a service-oriented architecture, enabling tasks to be execu
 To support remote execution, Airflow provides the Task SDK — a lightweight runtime environment for running Airflow tasks in external systems such as containers, edge environments, or other runtimes. This lays the groundwork for language-agnostic task execution and brings improved isolation, portability, and extensibility to Airflow-based workflows.
-Airflow 3.0 also introduces a new ``airflow.sdk`` namespace that exposes the core authoring interfaces for defining DAGs and tasks. DAG authors should now import objects like :class:`airflow.sdk.DAG`, :func:`airflow.sdk.dag`, and :func:`airflow.sdk.task` from ``airflow.sdk`` rather than internal modules. This new namespace provides a stable, forward-compatible interface for DAG authoring across future versions of Airflow.
+Airflow 3.0 also introduces a new ``airflow.sdk`` namespace that exposes the core authoring interfaces for defining DAGs and tasks. Dag authors should now import objects like :class:`airflow.sdk.DAG`, :func:`airflow.sdk.dag`, and :func:`airflow.sdk.task` from ``airflow.sdk`` rather than internal modules. This new namespace provides a stable, forward-compatible interface for Dag authoring across future versions of Airflow.
-3. DAG Authoring Enhancements
+3. Dag authoring Enhancements
 -----------------------------
 Writing your DAGs is now more consistent in Airflow 3.0. Use the stable :mod:`airflow.sdk` interface to define your workflows and tasks.
@@ -106,7 +106,7 @@ Why use ``airflow.sdk``?
 - :func:`airflow.sdk.get_current_context`
 - :func:`airflow.sdk.get_parsing_context`
-All DAGs must update their imports to refer to ``airflow.sdk`` instead of using internal Airflow modules directly. Deprecated legacy import paths, such as ``airflow.models.dag.DAG`` and ``airflow.decorator.task``, will be removed in a future version of Airflow. Some utilities and helper functions currently used from ``airflow.utils.*`` and other modules will gradually be migrated to the Task SDK over the next minor releases. These upcoming updates aim to completely separate DAG creation from internal Airflow services. DAG authors can look forward to continuous improvements to airflow.sdk, with no backwards-incompatible changes to their existing code.
+All DAGs must update their imports to refer to ``airflow.sdk`` instead of using internal Airflow modules directly. Deprecated legacy import paths, such as ``airflow.models.dag.DAG`` and ``airflow.decorator.task``, will be removed in a future version of Airflow. Some utilities and helper functions currently used from ``airflow.utils.*`` and other modules will gradually be migrated to the Task SDK over the next minor releases. These upcoming updates aim to completely separate DAG creation from internal Airflow services. Dag authors can look forward to continuous improvements to airflow.sdk, with no backwards-incompatible changes to their existing code.
 Legacy imports (deprecated):
@@ -131,7 +131,7 @@ Explore a variety of DAG examples and patterns in the :doc:`examples` page.
 5. Concepts
 -----------
-Discover the fundamental concepts that DAG authors need to understand when working with the Task SDK, including Airflow 2.x vs 3.x architectural differences, database access restrictions, and task lifecycle. For full details, see the :doc:`concepts` page.
+Discover the fundamental concepts that Dag authors need to understand when working with the Task SDK, including Airflow 2.x vs 3.x architectural differences, database access restrictions, and task lifecycle. For full details, see the :doc:`concepts` page.
 Airflow 2.x Architecture
 ^^^^^^^^^^^^^^^^^^^^^^^^
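
The wording change in this diff is mechanical: every casing of "DAG author(s)" and "DAG authoring" becomes "Dag author(s)" or "Dag authoring", while other uses of "DAG" are left alone. A minimal sketch of that substitution, assuming a plain regex pass over each line of text; `normalize_dag_author` is a hypothetical helper name, and nothing here implies the change was actually produced this way:

```python
import re

# Hypothetical pattern: any casing of "dag author", "dag authors" or
# "dag authoring", matched as whole words separated by a single space.
_DAG_AUTHOR = re.compile(r"\bdag (author(?:s|ing)?)\b", re.IGNORECASE)


def normalize_dag_author(text: str) -> str:
    """Rewrite 'DAG Author(s)' / 'DAG authoring' as 'Dag author(s)' / 'Dag authoring'."""
    # Keep the matched second word but lowercase it; force "Dag" as the first word.
    return _DAG_AUTHOR.sub(lambda m: "Dag " + m.group(1).lower(), text)


if __name__ == "__main__":
    for sample in (
        "As a DAG Author, you only have to have virtualenv installed",
        "All dag authors have access to all dags",
        "3. DAG Authoring Enhancements",
        "your DAG file",  # no 'author' suffix, so left unchanged
    ):
        print(normalize_dag_author(sample))
```

A line-based pass like this would miss a phrase that is wrapped across two source lines, so the result would still need a manual read-through.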