
Fix missing dags ui #1

Open

wants to merge 808 commits into master
Conversation

krzysztof-indyk

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
    • All the public functions and classes in the PR contain docstrings that explain what they do

Code Quality

  • Passes flake8

ryanyuan and others added 30 commits March 5, 2019 19:47
The test `SchedulerJobTest.test_scheduler_start_date` is the slowest test,
taking ~5 minutes on average:

    [success] 12.99% tests.test_jobs.SchedulerJobTest.test_scheduler_start_date: 295.1935s
    [success]  6.79% tests.test_jobs.SchedulerJobTest.test_scheduler_multiprocessing: 154.2304s
    [success]  6.72% tests.test_jobs.SchedulerJobTest.test_scheduler_task_start_date: 152.7215s
    [success]  4.34% tests.test_jobs.SchedulerJobTest.test_new_import_error_replaces_old: 98.7339s
    [success]  3.63% tests.test_jobs.SchedulerJobTest.test_remove_error_clears_import_error: 82.4062s

After setting the subdirectory and eliminating (I think) redundant scheduler
loops, the test time comes down to ~15 seconds.
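
As a rough illustration of the speed-up described above (not the actual test code), pointing the scheduler at a single test DAG file via `subdir` and capping the scheduling loops looks roughly like this with the Airflow 1.10 API; the path and counts are illustrative:

```python
# Illustrative sketch only: parse one test DAG file instead of the whole dags folder,
# and run a single scheduling loop instead of looping until a timeout.
from airflow.jobs import SchedulerJob

job = SchedulerJob(
    subdir="tests/dags/test_scheduler_dags.py",  # hypothetical path to the DAG under test
    num_runs=1,                                  # one loop is enough for the assertion
)
job.run()
```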
The BackfillJobTest suite now takes 57 seconds vs. the baseline of 147
seconds on my laptop.

A couple of optimizations:

- Don't sleep() if we are running unit tests
- Don't backfill more DagRuns than needed (reduced from 5 to 2, since we
  only need 2 DagRuns to verify that we can run backwards)

I've also made a few tests reentrant by clearing out the Pool, DagRun,
and TaskInstance tables between runs.
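
A minimal sketch of the table clean-up mentioned above, assuming the Airflow 1.10 session helper and models (the helper function itself is illustrative):

```python
# Illustrative helper: wipe rows left behind by earlier tests so each run starts clean.
from airflow.models import DagRun, Pool, TaskInstance
from airflow.utils.db import create_session

def clear_scheduler_tables():
    with create_session() as session:
        session.query(TaskInstance).delete()
        session.query(DagRun).delete()
        session.query(Pool).delete()
```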
shell=True is a security risk. Bash is not required to launch
tasks and consumes extra resources.
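
A generic sketch of the difference being described (not the executor code itself): launching the task command as an argument list avoids spawning a shell at all.

```python
import subprocess

command = ["airflow", "run", "example_dag", "example_task", "2019-03-05"]  # hypothetical task command

# Avoided: subprocess.Popen(" ".join(command), shell=True) spawns an extra bash
# process and exposes the command to shell injection via unescaped arguments.
proc = subprocess.Popen(command, close_fds=True)
proc.wait()
```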
Caused by an update in PR apache#3740:

    execute_command.apply_async(args=command, ...)

`command` is a list of short unicode strings, e.g. `["airflow", "run", "dag323", ...]`,
so `args=command` passes multiple positional arguments to a function defined as taking
only one argument. The resulting call, `execute_command("airflow", "run", "dag323", ...)`,
raises an error and exits.
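
A generic Celery sketch of the bug and the fix (not the Airflow executor code itself); the fix is to wrap the command list so the task receives it as a single argument:

```python
from celery import Celery

app = Celery("example", broker="memory://")  # in-memory broker just for illustration

@app.task
def execute_command(command):
    # Expects one argument: the full command as a list.
    print(command)

command = ["airflow", "run", "dag323", "task_a", "2019-03-05"]  # hypothetical

# Broken: args=command unpacks the list into several positional arguments,
# i.e. execute_command("airflow", "run", "dag323", ...), which raises TypeError.
# execute_command.apply_async(args=command)

# Fixed: args=[command] passes the whole list as a single argument.
execute_command.apply_async(args=[command])
```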
…ache#4851)

This change allows users to upgrade seamlessly without a hard-to-debug
error when a task is actually run. This allows us to pull the change
into 1.10.3.
Fixes issues when specifying a DAG with a schedule_interval of type relativedelta.
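
For context, the usage this fix enables looks roughly like the following (DAG id and interval are illustrative):

```python
from datetime import datetime
from dateutil.relativedelta import relativedelta
from airflow import DAG

# A schedule_interval given as a relativedelta instead of a timedelta or cron string.
dag = DAG(
    dag_id="monthly_report",                    # hypothetical
    start_date=datetime(2019, 1, 1),
    schedule_interval=relativedelta(months=1),  # same calendar day every month
)
```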
* [AIRFLOW-3905] Allow 'parameters' in SqlSensor

* Add check on conn_type & add test

Not all SQL-related connections are supported by SqlSensor,
due to limitations in the Connection model and hook implementations.
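
A sketch of the new `parameters` argument, assuming the `airflow.sensors.sql_sensor.SqlSensor` interface of that time (connection id, query, and values are illustrative):

```python
from datetime import datetime
from airflow import DAG
from airflow.sensors.sql_sensor import SqlSensor

with DAG("sql_sensor_example", start_date=datetime(2019, 1, 1), schedule_interval=None) as dag:
    wait_for_batch = SqlSensor(
        task_id="wait_for_batch",
        conn_id="my_postgres",                                       # hypothetical connection id
        sql="SELECT COUNT(*) FROM orders WHERE batch_id = %(batch)s",
        parameters={"batch": "2019-03-05"},                          # bound by the hook, not string-formatted
    )
```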
Add operators for transferring files between s3 and sftp.
Fix typos in various files
1. Copying:
Under the hood, it's `boto3.client.copy_object()`.
It can only handle the situation in which the
S3 connection used can access both source and
destination bucket/key.

2. Deleting:
2.1 Under the hood, it's `boto3.client.delete_objects()`.
It supports either deleting one single object or
multiple objects.
2.2 If users try to delete a non-existent object, the
request still succeeds, but the response contains an
'Errors' entry. Other conditions can produce similar
'Errors' entries while the request itself succeeds
without raising an exception. An argument
`silent_on_errors` is therefore added to let users decide
whether such 'Errors' should fail the operator.

The corresponding methods are added to S3Hook, and
these two operators are thin wrappers around them.
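
A minimal boto3 sketch of the two operations these operators wrap (bucket and key names are hypothetical, and the error handling shown is just one way to react to the 'Errors' entry):

```python
import boto3

s3 = boto3.client("s3")

# Copying: the client must be able to reach both the source and destination bucket/key.
s3.copy_object(
    Bucket="dest-bucket",
    Key="dest/data.csv",
    CopySource={"Bucket": "source-bucket", "Key": "source/data.csv"},
)

# Deleting: one or many keys in a single request. Deleting a non-existent key still
# returns success, but the response may carry an 'Errors' entry.
response = s3.delete_objects(
    Bucket="dest-bucket",
    Delete={"Objects": [{"Key": "dest/data.csv"}, {"Key": "dest/old.csv"}]},
)
if response.get("Errors"):
    raise RuntimeError("S3 reported errors: {}".format(response["Errors"]))
```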
- fixes up_for_retry and up_for_reschedule tasks when hovering over retry/rescheduled task state
- adds missing task states to stateFocusMap that will be used for highlighting tasks when clicking on task state
- removed invalid attributes from some tags
- reformatted according to the rules in .editorconfig

[AIRFLOW-3807] Fix no_status tasks highlighting in graph view

[AIRFLOW-3807] Change "no status" string to "no_status"

[AIRFLOW-3807] Fix syntax issue in js statement

[AIRFLOW-3807] Correct tree view tasks' status labels

- reformat tree.html file
- remove invalid attributes from tree.html tags
Some links are incorrect when base_url is set.
The SchedulerJobTest suite now takes ~90 seconds on my laptop (down from
~900 seconds == 15 minutes on Jenkins).

There are a few optimizations here:

1. Don't sleep() for 1 second every scheduling loop (in unit tests)
2. Don't process the example DAGs
3. Use `subdir` to process only the DAGs we need, for a couple of tests
   that actually run the scheduler
4. Only load the DagBag once instead of before each test

I've also added a few tables to the list of tables that are cleaned up
in between test runs to make the tests re-entrant.
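
Item 4 above (loading the DagBag once) might look roughly like this in unittest terms; the folder path is illustrative:

```python
import unittest
from airflow.models import DagBag

class SchedulerJobTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Parse the DAG files once for the whole suite; skip the example DAGs.
        cls.dagbag = DagBag(dag_folder="tests/dags", include_examples=False)
```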
We're hitting this race condition frequently now that we don't sleep()
during unit tests. We don't actually need to assert that the task is
currently running - it's fine if it has already run successfully.
1. Renamed files:
- tests/configuration.py → tests/test_configuration.py
- tests/impersonation.py → tests/test_impersonation.py
- tests/utils.py → tests/test_utils.py
- tests/operators/operators.py → tests/operators/test_operators.py
- tests/operators/bash_operator.py → tests/operators/test_bash_operator.py
- tests/jobs.py → tests/test_jobs.py

2. Updated tests/__init__.py accordingly

3. Fixed database-specific tests in tests/operators/test_operators.py

4. Fixed issue in tests/operators/test_bash_operator.py
…any (apache#3986)

For all SQL operators based on DbApiHook, the SQL command itself is
printed to log.info, but any parameters used with the command are not
included, which makes the log less useful.

This commit ensures that the parameters, if any, are also printed to
log.info.
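
A generic logging sketch of the behaviour described (not the exact DbApiHook code): emit the bound parameters alongside the SQL statement whenever they are present.

```python
import logging

log = logging.getLogger(__name__)

def log_statement(sql, parameters=None):
    # Hypothetical helper: mirror the statement and its parameters into the task log.
    if parameters is not None:
        log.info("Executing: %s, parameters: %s", sql, parameters)
    else:
        log.info("Executing: %s", sql)
```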
ashb and others added 28 commits April 1, 2019 16:23
* [AIRFLOW-3947] Flash msg for no DAG-level access error

It shows a reminder when a user clicks on a DAG for which they
don't have can_dag_read or can_dag_edit permissions.

* Change the flash msg contents
NOTE: This operator only transfers the latest attachment by name.
Sendgrid just released 6.0 with breaking changes, and I don't have the
time to work out how to change our code or tests, as they haven't
published a migration guide :(
dagre-d3 v0.6.3 has a bug that causes this Javascript error when loading
the Graph View:

    TypeError: previousPaths.merge is not a function

The bug fix [1] has been merged to master, but hasn't been released to
npm yet. This change temporarily downgrades our version of dagre-d3
until dagre-d3 v0.6.4 is released [2]

I also fixed a bug I encountered in `compile_assets.sh`, where the
script would fail if the directory `airflow/www/static/dist/` exists but
is empty.

[1] dagrejs/dagre-d3#350
[2] https://github.com/dagrejs/dagre-d3/blob/5450627790ff42012ef50cef6b0e220199ae4fbe/package.json#L3
…he#5040)

- change method to upload data to s3 from load_string to load_bytes
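
A hedged S3Hook sketch of the change (bucket, key, and payload are illustrative): upload raw bytes with `load_bytes` rather than decoding them for `load_string`.

```python
from airflow.hooks.S3_hook import S3Hook

hook = S3Hook(aws_conn_id="aws_default")
payload = b"\x89PNG\r\n..."  # binary data that must not be decoded to str
hook.load_bytes(payload, key="reports/output.bin", bucket_name="my-bucket", replace=True)
```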
…#5039)

To make the requests POSTs and to follow the redirect that the backend
issues, I turned the "toggle" buttons into an actual form, which means
much less logic is needed to build up the URL: the browser handles
it all for us. The only thing we have to do is set the "action" on the
form.

For the "link" ones (delete, trigger, refresh) I wrote a short
`postAsForm` helper which takes the URL and submits a form. A little bit
messy, but it works.
This reverts commit c1a23e6.

This is still useful for larger/more complex DAGs
Downstream tasks should run as long as their parents are in
`success`, `failed`, or `upstream_failed` states.
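
One common way to express that behaviour in a DAG is the `all_done` trigger rule, which fires once every parent has reached a terminal state; this illustrates the rule rather than the patch itself:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils.trigger_rule import TriggerRule

with DAG("all_done_example", start_date=datetime(2019, 1, 1), schedule_interval=None) as dag:
    parent = DummyOperator(task_id="parent")
    cleanup = DummyOperator(task_id="cleanup", trigger_rule=TriggerRule.ALL_DONE)
    parent >> cleanup  # cleanup runs whether parent ends in success, failed, or upstream_failed
```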
Unit tests added for PigCliHook as well to prevent
future issues.

Closes apache#3594 from jakahn/master
…lta SLAs (apache#4939)

Modify SchedulerJob.manage_slas to respect zero timedelta SLAs
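
The case this covers looks roughly like the following (DAG and task names are illustrative): a task whose SLA is a zero timedelta, i.e. it is considered late as soon as its scheduled period ends.

```python
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

with DAG("zero_sla_example", start_date=datetime(2019, 1, 1), schedule_interval="@daily") as dag:
    report = DummyOperator(task_id="report", sla=timedelta(0))  # zero-timedelta SLA
```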
* Clear runs for BackfillJobTest

* Fixing import

* Fixing flake8
apache#5111)

We used to key on `safe_dag_id`, but that got changed in apache#4368 to use a
more efficient query, which broke for non-sub-DAGs that contain a `.` in their id.

Arguably this escaping is a front-end concern, so handling it in the front end makes sense anyway.

(cherry picked from commit c696dd4)
galuszkak pushed a commit that referenced this pull request Jan 23, 2020
…n using job_flow_name and no cluster is found (apache#6898)

* [AIRFLOW-6432] fixes in EmrAddStepsOperator

fix EmrAddStepsOperator broken ref & faulty test

* changes after CR #1

* Add exception and test case

* Update airflow/contrib/hooks/emr_hook.py

Co-Authored-By: Tomek Urbaszek <[email protected]>

* Update airflow/contrib/hooks/emr_hook.py

Co-Authored-By: Tomek Urbaszek <[email protected]>

* Update airflow/contrib/operators/emr_add_steps_operator.py

Co-Authored-By: Tomek Urbaszek <[email protected]>

* Update airflow/contrib/hooks/emr_hook.py

Co-Authored-By: Tomek Urbaszek <[email protected]>

* Update tests/contrib/operators/test_emr_add_steps_operator.py

Co-Authored-By: Tomek Urbaszek <[email protected]>

* changes after CR apache#2

Co-authored-by: Tomek Urbaszek <[email protected]>
galuszkak pushed a commit that referenced this pull request Mar 5, 2020