GracefulErrorAdapter fails with Parallelizable #1009
Comments
Ahh, yes, good point. Thanks! That's a reasonable workaround -- would you mind sharing the code for it? Otherwise we can take a look to see how we might cascade it forwards with these expectations. That said, it's a little iffy on requirements. What would you expect to happen (I think I know which one I'd like...):
(2) is not crazy, but (1) is a good UX IMO. And I think might be easy to implement... |
I modified the adapter to have:

    def run_to_execute_node(
        self, *, node_callable: Any, node_kwargs: Dict[str, Any], node_tags: dict[str, Any], **future_kwargs: Any
    ) -> Any:
        """Executes a node. If the node fails, returns the sentinel value."""
        for key, value in node_kwargs.items():
            if value == self.sentinel_value:  # == versus is
                return self.sentinel_value  # cascade it through
        try:
            return node_callable(**node_kwargs)
        except self.error_to_catch:
            # Tagged Parallelizable nodes must hand back an iterable.
            if node_tags.get("for_graceful", "") == "parallel":
                return [self.sentinel_value]
            return self.sentinel_value

And had:

    from hamilton.function_modifiers import tag
    from hamilton.htypes import Parallelizable

    @tag(for_graceful="parallel")
    def distro(n: int, allow_fail: bool) -> Parallelizable[int]:
        for x in range(n):
            if x > 4 and allow_fail:
                raise Exception("bad")
            yield x * 3

I would prefer (1). If the function fails before the yield/return portion, then I'd expect (2), or for it to act as if it gets one run through the parallelizable block with the sentinel, as long as the collect block gets some kind of list of sentinels/actual results.

I'm working on an expansion of the adapter that'll grab the traceback and other metadata and pass that forward as a specific object class (as the sentinel), so I know everything will make it to the end no matter what (and I can introspect and write my logs, etc. based on that). Because of that, anything that fails safely forward and passes the sentinel down is preferred. |
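A minimal sketch of what a sentinel object carrying the traceback could look like. Everything here is hypothetical and standalone: `FailureSentinel` and `execute_gracefully` are illustrative names, not part of Hamilton or of the adapter being built, and the function only mimics the try/except and cascade logic of the modified `run_to_execute_node` shown in this comment.

```python
import traceback
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class FailureSentinel:
    """Hypothetical sentinel that records why a node failed."""

    node_name: str
    error_type: str
    message: str
    traceback_text: str = field(repr=False)


def execute_gracefully(
    node_name: str, node_callable: Callable[..., Any], node_kwargs: Dict[str, Any]
) -> Any:
    """Runs a callable; on failure returns a FailureSentinel instead of raising."""
    # Cascade: if any input is already a failure, pass it on without running.
    for value in node_kwargs.values():
        if isinstance(value, FailureSentinel):
            return value
    try:
        return node_callable(**node_kwargs)
    except Exception as exc:  # a real adapter would catch only its configured errors
        return FailureSentinel(
            node_name=node_name,
            error_type=type(exc).__name__,
            message=str(exc),
            traceback_text=traceback.format_exc(),
        )


def divide(a: float, b: float) -> float:
    return a / b


# The first node fails; the second receives the failure and passes it along.
failed = execute_gracefully("ratio", divide, {"a": 1, "b": 0})
downstream = execute_gracefully("double", lambda ratio: ratio * 2, {"ratio": failed})
```

Checking inputs with `isinstance` rather than `==` or `is` sidesteps the comparison question noted in the code comment above, since each failure is its own object and still reaches the end of the graph intact.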
Ok, great, would love it if you contributed that back! Yes, agreed on (1), and your edge-cases. Easy enough implementation (might have to do a little more surgery, but shouldn't be too hard):
Note that with (1) it could also fail for every step after any errors (in the generator), assuming that the first error cuts out control flow. IMO that's cleaner -- you don't want to be running code after an exception. So the example from (1) would be

The ideal solution would be to check for the sentinel type inside the framework but I think that's a bit challenging and doesn't change too much.

Re: implementation -- you seem to be building something exciting -- do you want to contribute back with these changes? I've uploaded a PR that provides the deeper framework-level changes you'll need -- figure that will help you get started. If you're going to build the adapter for your own use-case we can probably find a way to generalize it once you're happy together. |
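A rough sketch of one reading of (1), assuming it means "keep whatever the generator yielded before the error, then emit a single sentinel and stop". `drain_with_sentinel` and `SENTINEL` are hypothetical names for illustration, and this is plain Python rather than the framework-level change in the PR.

```python
from typing import Any, Callable, Iterator, List

SENTINEL = object()  # stand-in for the adapter's sentinel value


def drain_with_sentinel(make_generator: Callable[[], Iterator[Any]]) -> List[Any]:
    """Collects yielded items; if the generator raises, appends one sentinel and stops."""
    results: List[Any] = []
    try:
        for item in make_generator():
            results.append(item)
    except Exception:
        # The first error ends control flow; nothing after it is run.
        results.append(SENTINEL)
    return results


def distro(n: int, allow_fail: bool) -> Iterator[int]:
    """Same generator as in the thread, minus the Hamilton decorators."""
    for x in range(n):
        if x > 4 and allow_fail:
            raise Exception("bad")
        yield x * 3


with_failure = drain_with_sentinel(lambda: distro(7, True))
without_failure = drain_with_sentinel(lambda: distro(7, False))
```

Under this reading, a generator that fails before its first yield produces a one-element list holding only the sentinel, which matches the "one run through the parallelizable block with the sentinel" edge case raised earlier.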
Thank you very much for the PR upload, can't wait to dig in. I'll definitely be putting up a PR when I have it moderately functional! |
You got it! I don't doubt you'll be able to figure this out but feel free to reach out if you want help navigating -- slack is probably the easiest to reach us. |
Closing since PR was merged |
GracefulErrorAdapter does not know to return a list of the sentinel value for Parallelizable nodes (ones that are of type EXPAND).
Current behavior
Stack Traces
The error can be reproduced with the code in the steps to replicate.
Steps to replicate behavior
Toggling allow_fail shows that the GracefulError works nicely within a parallel block, but not for the entry point.
Library & System Information
Windows 10 Pro
Python 3.12.3
Hamilton 1.69.0 (7fd5e16)
Expected behavior
The node should fail safely and press on.
Possible Solution
The cause is straightforward: hamilton expects an iterable back from an EXPAND node (really from an EXPAND_UNORDERED group, I think). The GracefulErrorAdapter doesn't know to do that currently.
The solution I was attempting to pursue was to inject more node data into the run_to_execute_node method of the adapter. However, the base class doesn't allow an override of do_node_execute such that I could pass that information down into the adapter (such as from node_).
My first workaround was to modify the adapter to look for the "expand-" at the start of the task_id value that's passed in, but that catches all nodes within the parallelizable block, not just the first one. Currently I am tagging the parallelizable nodes and looking for the tag in the do_node_execute to return [self.sentinel_value], which avoids the error.
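To illustrate why a bare sentinel breaks at the entry point while a one-element list does not, here is a small standalone sketch. `expand_tasks` is a hypothetical stand-in for whatever step iterates over the EXPAND node's output; it is not Hamilton's actual internals, and `SENTINEL` is just a placeholder object.

```python
from typing import Any, List

SENTINEL = object()  # placeholder for the adapter's sentinel value


def expand_tasks(expand_output: Any) -> List[Any]:
    """Stand-in for the step that spawns one sub-task per yielded item."""
    return [item for item in expand_output]


# A bare sentinel is not iterable, so the expansion step fails.
try:
    expand_tasks(SENTINEL)
    bare_sentinel_ok = True
except TypeError:
    bare_sentinel_ok = False

# Wrapping the sentinel in a list gives the expansion step one item to fan out.
wrapped = expand_tasks([SENTINEL])
```

With the wrapped form, the sentinel flows through the parallel block as a single item and the cascade check in the adapter can pass it on to the collect step.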