add serializer and parser for excel #973
Comments
@schloerke looking for simple guidance/preferences; there are two questions below.

The default behavior of I suggest we can support the following annotations:

    ### all of the following result in the same thing
    #* @parser xlsx
    #* @parser xlsx list(sheet=NULL)
    #* @parser xlsx list(sheet=1)
    ### extension
    #* @parser xlsx list(sheet=NA)

The latter call is intercepted by the parser. It'll wrap internally, returning a named list of frames (names are the worksheet names). As it stands, internally I'm planning

    if (anyNA(sheet)) sheet <- readxl::excel_sheets(tmpfile)
    lapply(sheet, function(sht) readxl::read_xlsx(path = tmpfile, sheet = sht, ...))

This is simple enough and it works. However, without guardrails, the user can use

    #* @parser xlsx list(sheet=c(1,2,5))

which will still iterate and fail if there are fewer than 5 sheets. Clearly I can guard against this failure as well. While I have no personal use-case where this would be useful, I also don't feel it is necessary to actively block it, as a guard just adds code, and the error messages tend to be clear and unambiguous.

Question 1: do you care enough about this that we should actively prohibit this behavior, or do we allow the user to put themselves in this position? Said differently, do you feel strongly enough that we need to enforce

Now the simple point of what to return ... options:
Question 2: Do you prefer "always list of frames"? I personally feel that while option 2 does present a polymorphic parser of sorts (discouraged by some people), the return of anything other than a simple frame requires explicit action by the user, so it should be unambiguous.
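For reference, the `sheet` handling discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not plumber's implementation: `read_sheets()` is a hypothetical helper, the `list_sheets` and `read_sheet` arguments exist only so the sketch runs without an actual workbook, and the out-of-range guard is one possible answer to Question 1 rather than an agreed behavior.

```r
# Hypothetical helper (not part of plumber): resolve `sheet` and always
# return a named list of data frames, whatever the user asked for.
read_sheets <- function(path, sheet = NULL,
                        list_sheets = readxl::excel_sheets,
                        read_sheet = readxl::read_excel, ...) {
  all_names <- list_sheets(path)

  if (is.null(sheet)) {
    sheet <- 1L            # same as the bare `#* @parser xlsx`
  } else if (anyNA(sheet)) {
    sheet <- all_names     # sheet = NA means "every worksheet"
  }

  # Translate positions to worksheet names so the result is always named.
  # The range check is the optional guardrail from Question 1.
  if (is.numeric(sheet)) {
    bad <- sheet[sheet < 1 | sheet > length(all_names)]
    if (length(bad) > 0) {
      stop("Sheet index out of range: ", paste(bad, collapse = ", "))
    }
    sheet <- all_names[sheet]
  }

  out <- lapply(sheet, function(sht) read_sheet(path, sheet = sht, ...))
  names(out) <- sheet
  out
}
```

With this shape, `sheet = NULL`, `sheet = 1` and `sheet = NA` all return a list, so downstream code never has to branch on whether it received a single frame or several.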
If a file has a single sheet... should it return a

This being the case, I'm up for a single method that always returns a list of data frames, as an Excel file always has sheets that contain data.

(Is it possible to use
Is it possible to wrap the reading of the sheet in a

    # untested
    ret <- lapply(sheet, function(sht) {
      tryCatch({
        readxl::read_excel(path = tmpfile, sheet = sht, ...)
      }, error = function(e) {
        # flag the request as unprocessable, then re-raise with the sheet name
        res$status <- 422L
        rlang::abort(
          paste0("Error reading sheet: ", sht),
          parent = e
        )
      })
    })

This way plumber isn't making any assumptions on what

Related: Sometimes I've seen an attempt to gracefully handle an item's error processing by setting the item to

I don't think we should set
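A base-R variant of the same idea, for anyone who wants to try the error wrapping without rlang or a real workbook. This is a sketch only: `read_each_sheet()` and its injectable `read_sheet` argument are hypothetical names introduced for illustration, and setting the HTTP status on `res` is deliberately left out.

```r
# Hypothetical helper: read each sheet and re-raise any failure with the
# sheet name attached. `read_sheet` defaults to readxl::read_excel but can
# be swapped for a stub, so the sketch runs without readxl installed.
read_each_sheet <- function(path, sheet,
                            read_sheet = readxl::read_excel, ...) {
  lapply(sheet, function(sht) {
    tryCatch(
      read_sheet(path, sheet = sht, ...),
      error = function(e) {
        # Keep the original message so the underlying cause is not lost
        stop(
          "Error reading sheet: ", sht, " (", conditionMessage(e), ")",
          call. = FALSE
        )
      }
    )
  })
}
```

The first failing sheet aborts the whole call, which matches the snippet above: no item is silently replaced with a placeholder value.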
I think it is paramount to have type stability in the parsers, so I support the idea of always returning a list, even when the xlsx file contains only a single sheet.
I prefer the notion of "always a list of frames" (agree with @thomasp85 on "type-stability", that was the term I needed instead of polymorph). I prefer a single parser overloading the

Having said that, to address your points @schloerke:
Are you good with
To add some "inconsistency", however ... it's
btw, the tests in
See #959 (comment)