Add basic Airflow error guide #44616
List of core Airflow exceptions from AirflowException
AirflowBadRequest
AirflowNotFoundException
DagNotFound
DagCodeNotFound
DagRunNotFound
AirflowConfigException
AirflowSensorTimeout
AirflowRescheduleException
InvalidStatsNameException
AirflowTaskTimeout
AirflowTaskTerminated
AirflowWebServerTimeout
AirflowSkipException
AirflowFailException
AirflowOptionalProviderFeatureException
AirflowInternalRuntimeError
XComNotFound
UnmappableOperator
XComForMappingNotPushed
UnmappableXComTypePushed
UnmappableXComLengthPushed
AirflowDagCycleException
AirflowDagDuplicatedIdException
AirflowClusterPolicyViolation
AirflowClusterPolicySkipDag
AirflowClusterPolicyError
AirflowTimetableInvalid
AirflowFileParseException
FileSyntaxError
ConnectionNotUnique
TaskDeferred
TaskDeferralError
PodMutationHookException
PodReconciliationError
RemovedInAirflow3Warning
AirflowProviderDeprecationWarning
DeserializingResultError
UnknownExecutorException
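A list like the one above could be regenerated rather than maintained by hand. The following is only a minimal sketch of that idea: `iter_subclasses` is a hypothetical helper, and the toy classes are stand-ins so the snippet runs without Airflow installed. In a real environment one would presumably pass `airflow.exceptions.AirflowException` as the base, and only classes that have already been imported would show up.

```python
from typing import Iterator


def iter_subclasses(base: type) -> Iterator[type]:
    """Yield every direct and indirect subclass of ``base`` exactly once."""
    seen = set()
    stack = list(base.__subclasses__())
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue  # guard against diamonds in the class hierarchy
        seen.add(cls)
        yield cls
        stack.extend(cls.__subclasses__())


# Toy stand-ins (hypothetical names, not Airflow classes).
class BaseError(Exception):
    pass


class NotFoundError(BaseError):
    pass


class DagMissingError(NotFoundError):
    pass


class ConfigError(BaseError):
    pass


names = sorted(cls.__name__ for cls in iter_subclasses(BaseError))
print(names)
```

Sorting by name keeps the output stable across runs, which makes the result easier to diff against a checked-in list.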
One comment here @omkar-foss. This is quite a change in how we treat errors, so it would be great to announce the intention to implement those error numbers and messages on the devlist. While there was a survey and a few people discussed that this is a good idea, "What did not happen on devlist, did not happen", so it is probably best to start a discussion on the devlist with the intention to run a lazy consensus (or a vote in case there are any doubts).
Done, sent on devlist ✅ Apologies for the delayed response! I'll continue adding error mappings to this PR while we await responses on devlist and finalize items etc.
In accordance with @ashb's feedback on this Slack thread to include errors relevant to end users, I've updated the Airflow Error Codes list in this PR with the top 100 user-facing errors, along with their descriptions and newly assigned (tentative) error codes. I created this top 100 list by referring to Airflow-related questions on StackOverflow, suggestions from ChatGPT, and a few questions asked on the #user-troubleshooting Slack channel. There's a lot of scope for improving this list, so it would be great if you could check it out and drop a comment on this PR as necessary. Thank you :) Markdown-rendered view here: https://github.com/apache/airflow/blob/86fc8b10bd248e41aba2d80de76bac04280e2c03/dev/AIRFLOW_ERROR_GUIDE.md
As discussed in Slack, the value of that list and the page is going to be WAY better if there is an action that the user can take for each of those errors. Users often do not look for a description of what is going on; they are looking for solutions. In a number of cases we can at the very least guide them where to look for such solutions and which part of the documentation to read (i.e. link to the relevant documentation). In some other cases we can suspect that this is a deployment issue and tell the users to look there. In many other cases we can even point them to the actual configuration parameters that could be changed, or to typical resolutions and areas they should look at. In many other cases you can add some examples of what could be done. The ruff rules are a very good example of this approach, like https://docs.astral.sh/ruff/rules/#legend: many of those rules explain what happened, and a number of them provide a proposed solution or an example fix. While it's a bit "easier" with ruff, as the rules are simpler than potential Airflow errors, I see no reason why we should not be able to at least guide people to the solutions. That might significantly decrease the number of issues people open in our repo, and even if not, it will make it easier for all contributors, committers and the triage team to respond to such issues and direct the users to those pages, providing a first line of support for our users. All of that does not have to be in the PR to get it merged, but IMHO we should design it in a way that makes it possible, and "crowdsource" filling in that information (via an issue where we will have)
And let the community contribute the possible solutions and things to look at there. Possibly a table like that is a bit too "small" to keep that information.
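To make the suggestion above concrete, here is a rough sketch of what one guide entry could carry once it has to hold more than a table row: a description, some first steps for the user, and an optional documentation link. Everything in it is an assumption for illustration: the `ErrorGuideEntry` and `render_help` names, the `AERR001` code format, and the entry text are hypothetical and are not taken from the PR.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ErrorGuideEntry:
    """One entry of an error guide (hypothetical structure)."""

    code: str                          # tentative error code, format is assumed
    exception_name: str                # exception class the code maps to
    description: str                   # what happened
    first_steps: Tuple[str, ...] = ()  # actions the user can try first
    docs_url: Optional[str] = None     # link to relevant docs, if any


# Illustrative content only; the code and wording are made up for this sketch.
GUIDE: Dict[str, ErrorGuideEntry] = {
    "AERR001": ErrorGuideEntry(
        code="AERR001",
        exception_name="AirflowConfigException",
        description="A configuration option is missing or invalid.",
        first_steps=(
            "Check airflow.cfg for the option named in the message.",
            "Check AIRFLOW__{SECTION}__{KEY} environment variable overrides.",
        ),
    ),
}


def render_help(code: str) -> str:
    """Return a short, user-facing help text for an error code."""
    entry = GUIDE.get(code)
    if entry is None:
        return f"Unknown error code: {code}"
    lines = [f"{entry.code} ({entry.exception_name}): {entry.description}"]
    lines += [f"  - {step}" for step in entry.first_steps]
    if entry.docs_url:
        lines.append(f"  See: {entry.docs_url}")
    return "\n".join(lines)


print(render_help("AERR001"))
```

Keeping `first_steps` and `docs_url` optional would let entries be merged with only a description and filled in later by the community.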
related: #43171
This PR introduces a very basic guide that maps Airflow error codes to some common errors. It can be treated as a starting point that we can all add to, improving the coverage of errors and their mapping to possible causes and resolutions.