Allow udf in run_udf to be array of strings #374
Comments
Hmm, I'm not sure about this. We also experiment with use cases (in R) where we may need to pass multiple UDF files (i.e. multiple file paths or URIs), and then we run into ambiguities: an array of strings could be anything, and we would no longer have a good pattern to distinguish the cases. That the file path is rarely used is IMHO not a good argument yet, as user workspaces are simply not supported by back-ends yet, so file paths can't be used even though people may want to use them.
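To make the ambiguity concern concrete, here is a minimal sketch. It assumes a purely hypothetical heuristic (`looks_like_reference`, not part of any openEO specification or client) that a back-end might use to guess whether an array of strings holds file paths/URIs or lines of source code:

```python
import re

# Hypothetical heuristic, for illustration only: treat a string as a
# "reference" if it is a URI or a single token without whitespace.
URI_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://\S+$")


def looks_like_reference(value: str) -> bool:
    if URI_RE.match(value):
        return True
    return value != "" and re.search(r"\s", value) is None


def guess_kind(udf: list[str]) -> str:
    """Guess what an array of strings is meant to be."""
    if all(looks_like_reference(v) for v in udf):
        return "references"
    return "source lines"


paths = ["udfs/ndvi.py", "https://example.com/helpers.py"]
code = ["def apply(x):", "    return x"]
tricky = ["main()"]  # one line of code that also passes the path test

print(guess_kind(paths))
print(guess_kind(code))
print(guess_kind(tricky))  # misclassified as "references"
```

Any pattern-based guess of this kind has such edge cases, which is the reason for wanting an unambiguous way to tell the variants apart.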
Interesting. But if you want to pass multiple "files", you probably want to encode this as a mapping (filename -> contents) instead of just an array, because these files probably want to refer to each other (e.g. based on a filename).
The problem with a file path or URL is that you add a dependency on an external resource that might change unannounced, disappear, cause permission issues, and so on. Having your UDF fully embedded within your process graph is a good thing if you want to make it completely self-contained.
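As a rough sketch of the mapping idea, assuming a hypothetical encoding in which `udf` is a JSON object from filename to file contents (this is not part of the current `run_udf` schema, and the filenames and `is_file_mapping` helper are made up for illustration):

```python
import json

# Hypothetical multi-file UDF: filename -> contents. The files can refer
# to each other by name, and everything stays embedded in the process graph.
udf_files = {
    "main.py": "from helpers import scale\n\ndef apply(x):\n    return scale(x)\n",
    "helpers.py": "def scale(x):\n    return x * 2\n",
}


def is_file_mapping(udf) -> bool:
    """True if `udf` is a mapping of string filenames to string contents."""
    return isinstance(udf, dict) and all(
        isinstance(name, str) and isinstance(content, str)
        for name, content in udf.items()
    )


# The mapping survives a JSON round trip unchanged, so it is self-contained.
encoded = json.dumps(udf_files, indent=2)
decoded = json.loads(encoded)
print(is_file_mapping(decoded))
```

Unlike a bare array, an object is distinguishable from both a single string and a list of strings by its JSON type alone.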
For file paths this is not required: you have the filename/path directly available and can load the content from it. It just gives you a bit more structure for complex UDFs instead of squashing everything into one file (and some languages are relatively strict about what they allow in a file; think Java with one class per file, or C++ with h and cpp files, if you think about the future).
Agreed for URLs, but I'd sincerely hope that this doesn't occur for my user workspace at the back-end. And I personally would prefer to not have the UDF available as a file as for me it is easier to just up- and download files. But that's a matter of taste, indeed.
I'd like to raise the importance of this feature request for version-control-friendly inline UDFs (e.g. as a list of strings). I just introduced a bug in an internal project because I could not properly review the diff of the UDF code.
openeo-processes/run_udf.json
Lines 30 to 51 in d0ce91f
In all practical use cases of run_udf I've seen, the UDF code is provided as an inline string (not as URL or file path).
Putting the UDF code in a single one-line string with newlines encoded as \n is, however, horrible for version control (when the process graph is stored in a repository, which we often do).
Feature request: allow the udf code also to be an array of strings (one string per line of code), which is a lot friendlier for version control tools. (Note that this is the JSON encoding approach used by Jupyter notebooks.)