Runtime error handling in the Fetch Notice Step #543

Open
cristianvasquez opened this issue Jun 25, 2024 · 0 comments


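When the API fails at runtime, for example because of a signature change or a connectivity problem, Airflow still marks the task as a success. This is a problem because we won’t know when the pipeline stops working. In the log below, the TED API request returns 404 on every retry and the errors are logged, yet the task returns None and is marked as SUCCESS.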

*** Reading local file: /opt/airflow/logs/dag_id=fetch_notices_by_date/run_id=manual__2024-06-14T08:48:14.612626+00:00/task_id=fetch_by_date_notice_from_ted/attempt=1.log
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1083} INFO - Dependencies all met for <TaskInstance: fetch_notices_by_date.fetch_by_date_notice_from_ted manual__2024-06-14T08:48:14.612626+00:00 [queued]>
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1083} INFO - Dependencies all met for <TaskInstance: fetch_notices_by_date.fetch_by_date_notice_from_ted manual__2024-06-14T08:48:14.612626+00:00 [queued]>
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1279} INFO - 
--------------------------------------------------------------------------------
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1280} INFO - Starting attempt 1 of 1
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1281} INFO - 
--------------------------------------------------------------------------------
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1300} INFO - Executing <Task(_PythonDecoratedOperator): fetch_by_date_notice_from_ted> on 2024-06-14 08:48:14.612626+00:00
[2024-06-14, 08:48:16 UTC] {standard_task_runner.py:55} INFO - Started process 151 to run task
[2024-06-14, 08:48:16 UTC] {standard_task_runner.py:82} INFO - Running: ['airflow', 'tasks', 'run', 'fetch_notices_by_date', 'fetch_by_date_notice_from_ted', 'manual__2024-06-14T08:48:14.612626+00:00', '--job-id', '43154', '--raw', '--subdir', 'DAGS_FOLDER/fetch_notices_by_date.py', '--cfg-path', '/tmp/tmp4nni5_q_']
[2024-06-14, 08:48:16 UTC] {standard_task_runner.py:83} INFO - Job 43154: Subtask fetch_by_date_notice_from_ted
[2024-06-14, 08:48:16 UTC] {warnings.py:109} WARNING - /home/airflow/.local/lib/python3.8/site-packages/airflow/settings.py:249: DeprecationWarning: The sql_alchemy_conn option in [core] has been moved to the sql_alchemy_conn option in [database] - the old setting has been used, but please update your config.
  SQL_ALCHEMY_CONN = conf.get("database", "SQL_ALCHEMY_CONN")

[2024-06-14, 08:48:16 UTC] {task_command.py:388} INFO - Running <TaskInstance: fetch_notices_by_date.fetch_by_date_notice_from_ted manual__2024-06-14T08:48:14.612626+00:00 [running]> on host ip-10-68-154-167.eu-west-1.compute.internal
[2024-06-14, 08:48:16 UTC] {taskinstance.py:1507} INFO - Exporting the following env vars:
[email protected]
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=fetch_notices_by_date
AIRFLOW_CTX_TASK_ID=fetch_by_date_notice_from_ted
AIRFLOW_CTX_EXECUTION_DATE=2024-06-14T08:48:14.612626+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2024-06-14T08:48:14.612626+00:00
[2024-06-14, 08:48:17 UTC] {log.py:232} WARNING - [2024-06-14 08:48:17] - ConsoleHandler:ROOT - WARNING - EventMessage :: {'message': 'Request returned status code 404, retrying in 1 seconds!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 16, 846241), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 30, 'caller_name': 'log_warning', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 16, 846225), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 16, 846228), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:18 UTC] {log.py:232} WARNING - [2024-06-14 08:48:18] - ConsoleHandler:ROOT - WARNING - EventMessage :: {'message': 'Request returned status code 404, retrying in 2 seconds!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 18, 243313), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 30, 'caller_name': 'log_warning', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 18, 243302), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 18, 243303), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:20 UTC] {log.py:232} WARNING - [2024-06-14 08:48:20] - ConsoleHandler:ROOT - WARNING - EventMessage :: {'message': 'Request returned status code 404, retrying in 3 seconds!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 20, 379829), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 30, 'caller_name': 'log_warning', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 20, 379818), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 20, 379819), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:23 UTC] {log.py:232} WARNING - [2024-06-14 08:48:23] - ConsoleHandler:ROOT - WARNING - EventMessage :: {'message': 'Request returned status code 404, retrying in 4 seconds!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 23, 519321), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 30, 'caller_name': 'log_warning', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 23, 519309), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 23, 519310), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:27 UTC] {log.py:232} WARNING - [2024-06-14 08:48:27] - ConsoleHandler:ROOT - WARNING - EventMessage :: {'message': 'Request returned status code 404, retrying in 5 seconds!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 27, 621110), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 30, 'caller_name': 'log_warning', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 27, 621099), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 27, 621100), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:32 UTC] {log.py:232} WARNING - [2024-06-14 08:48:32] - ConsoleHandler:ROOT - WARNING - EventMessage :: {'message': 'Max retries exceeded, retried 5 times!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 743675), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 30, 'caller_name': 'log_warning', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 743662), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 743663), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:32 UTC] {log.py:232} WARNING - [2024-06-14 08:48:32] - ConsoleHandler:ROOT - ERROR - EventMessage :: {'message': 'The TED-API call failed with: <Response [404]>', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 771895), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 40, 'caller_name': 'log_error', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 771888), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 771888), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:32 UTC] {log.py:232} WARNING - [2024-06-14 08:48:32] - ConsoleHandler:ROOT - ERROR - EventMessage :: {'message': 'No notices has been fetched!', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 794402), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 40, 'caller_name': 'log_error', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 794396), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 794396), 'duration': None, 'metadata': None, 'kwargs': None}
[2024-06-14, 08:48:32 UTC] {log.py:232} WARNING - [2024-06-14 08:48:32] - ConsoleHandler:DAG - DEBUG - TechnicalEventMessage :: {'message': 'fetch_notice_from_ted', 'created_at': datetime.datetime(2024, 6, 14, 8, 48, 16, 12420), 'year': 2024, 'month': 6, 'day': 14, 'severity_level': 10, 'caller_name': 'fetch_by_date_notice_from_ted', 'started_at': datetime.datetime(2024, 6, 14, 8, 48, 16, 558841), 'ended_at': datetime.datetime(2024, 6, 14, 8, 48, 32, 815447), 'duration': 16.256606, 'metadata': {'process_type': 'DAG', 'process_name': 'fetch_notices_by_date', 'process_id': None, 'process_context': None}, 'kwargs': None}
[2024-06-14, 08:48:32 UTC] {python.py:177} INFO - Done. Returned value was: None
[2024-06-14, 08:48:32 UTC] {taskinstance.py:1318} INFO - Marking task as SUCCESS. dag_id=fetch_notices_by_date, task_id=fetch_by_date_notice_from_ted, execution_date=20240614T084814, start_date=20240614T084816, end_date=20240614T084832
[2024-06-14, 08:48:32 UTC] {local_task_job.py:208} INFO - Task exited with return code 0
[2024-06-14, 08:48:32 UTC] {taskinstance.py:2578} INFO - 1 downstream tasks scheduled from follow-on schedule check

Process log
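One possible direction, sketched under assumptions: Airflow marks a task as failed (or up for retry) when the task callable raises an uncaught exception, so the fetch step could raise instead of only logging the error and returning. The names `NoticeFetchError` and `fetch_notices_or_fail` are hypothetical and not part of this project's code, and the sketch assumes an empty result should count as a failure, which may not hold for dates that genuinely have no notices.

```python
from typing import Callable, List


class NoticeFetchError(RuntimeError):
    """Hypothetical error for a fetch step that did not succeed."""


def fetch_notices_or_fail(fetch: Callable[[], List[str]]) -> List[str]:
    """Run a fetch callable and raise instead of returning quietly on failure.

    `fetch` stands in for whatever function calls the TED API and returns
    the fetched notice ids (an assumption made for illustration).
    """
    try:
        notice_ids = fetch()
    except Exception as error:
        # Re-raise so the exception leaves the task callable and the
        # scheduler can mark the task as failed.
        raise NoticeFetchError(f"The TED-API call failed with: {error}") from error

    if not notice_ids:
        # Assumption: an empty result is treated as an error here.
        raise NoticeFetchError("No notices have been fetched!")

    return notice_ids
```

If something like this were called from inside the `fetch_by_date_notice_from_ted` task, the exception would propagate out of the callable, so the run would no longer end with "Marking task as SUCCESS" after a 404.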
