
Support non-JSON uploads #28

Closed · jess28 opened this issue Mar 30, 2016 · 16 comments

jess28 commented Mar 30, 2016

I've been trying to use googleAuthR for a Shiny app that uploads user files to a shared Google Storage account. I'm having trouble figuring out how to set up the body of the request so that the actual file data gets sent to the API rather than just the JSON metadata. Right now, the file that ends up in Google Storage is just a JSON file containing the name and type of the file I want to upload.

I tried using httr's upload_file() to create the upload data and then passing that as the body, but I'm not very versed in httr or curl, so I'm not sure how else to do this. I think my issue is that the body of the request gets parsed into JSON, when I don't want the actual file data parsed at all.

I'm just trying to use a simple upload at this point (https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload) and I just can't figure it out!

Is there a specific way to create a file data stream or something for use with googleAuthR? I really appreciate the easy authorization with Shiny that this package provides, and I'd rather not have to deal with straight httr!

Thanks,

Jess

MarkEdmondson1234 (Owner) commented:

Hi Jess,

I've just recently used googleAuthR to work with GCS myself as part of the bigQueryR package, so I think I can help here.

Check out the uploadData.R file from that package, which has syntax very similar to what I see in the GCS help: https://github.com/MarkEdmondson1234/bigQueryR/blob/master/R/uploadData.R (it in turn borrows from bigrquery).

The pertinent bit is adding the CSV file as raw text into the body:

## standard_csv() is a function in the same file that writes the data.frame out as CSV text
## config, projectId, datasetId and tableId come from the surrounding function
csv <- standard_csv(upload_data)

boundary <- "--bqr_upload"
line_break <- "\r\n"

## metadata part: the job configuration as JSON
mp_body_schema <- paste(boundary,
                        "Content-Type: application/json; charset=UTF-8",
                        line_break,
                        jsonlite::toJSON(config, pretty = TRUE, auto_unbox = TRUE),
                        line_break,
                        sep = "\r\n")

## data part: the raw CSV text
## it's very fussy about whitespace -
## must match exactly https://cloud.google.com/bigquery/loading-data-post-request
mp_body_data <- paste0(boundary,
                       line_break,
                       "Content-Type: application/octet-stream",
                       line_break,
                       line_break,
                       csv)

mp_body <- paste(mp_body_schema, mp_body_data, paste0(boundary, "--"), sep = "\r\n")

l <- googleAuthR::gar_api_generator("https://www.googleapis.com/upload/bigquery/v2",
                                    "POST",
                                    path_args = list(projects = projectId,
                                                     jobs = ""),
                                    pars_args = list(uploadType = "multipart"),
                                    customConfig = list(
                                      httr::add_headers("Content-Type" = "multipart/related; boundary=bqr_upload"),
                                      httr::add_headers("Content-Length" = nchar(mp_body, type = "bytes"))
                                    ))

req <- l(path_arguments = list(projects = projectId,
                               datasets = datasetId,
                               tableId = tableId),
         the_body = mp_body)

I'm sure modifying this a bit will give you what you need. I'm also planning to look at a Google Storage API at some point, so I'm interested in your progress! From the same package I've also needed to extract data and assign ownership to a Google email address.

jess28 (Author) commented Mar 31, 2016

I'm trying to upload image files, so I encoded the file with the base64enc package and used that as the body of the request. However, now I'm getting an error that I'm having trouble tracing.
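Roughly what the upload call looks like (the file path, bucket and object name here are placeholders, and I'm adapting the gar_api_generator pattern from your example above):

## rough sketch of my approach - file path, bucket and object name are placeholders
library(googleAuthR)
library(base64enc)

## encode the image file as a base64 string and use it as the request body
img_b64 <- base64enc::base64encode("path/to/image.png")

f <- googleAuthR::gar_api_generator("https://www.googleapis.com/upload/storage/v1",
                                    "POST",
                                    path_args = list(b = "my-bucket", o = ""),
                                    pars_args = list(uploadType = "media",
                                                     name = "image.png"))

req <- f(the_body = img_b64)

The error I get is: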

Error in curl::curl_fetch_memory(url, handle = handle) : Failed sending data to the peer
Warning: Error in $: $ operator is invalid for atomic vectors
Stack trace (innermost first):
    72: match
    71: %in%
    70: retryRequest
    69: doHttrRequest
    68: f
    67: f [E:\Defenders_JWM\R\EndSp_images_dash/global.R#87]
    66: with_shiny
    65: uploadImg [E:\Defenders_JWM\R\EndSp_images_dash/server.R#234]
    64: saveData [E:\Defenders_JWM\R\EndSp_images_dash/server.R#222]
    63: observeEventHandler [E:\Defenders_JWM\R\EndSp_images_dash/server.R#183]
     1: runApp

Looking in my Google Storage account, I see the upload is actually there with the correct size, but the image does not show up at all. I have no idea what is happening. I cannot figure out where the "Error in $" is coming from, since it looks like the request gets all the way to the retry section of code, tests for a 200 or 201 response, and then fails somewhere in curl. Have you ever had a problem like this?

Thank you for pointing me to your BigQuery code; it just ended up not being what I needed, since I really only need the base64-encoded data in the body.

MarkEdmondson1234 (Owner) commented:

Ahh OK, it's not a simple data.frame :)

I'm afraid I haven't uploaded binary/jpeg files before in R.

It looks like the error is related to the HTTP request failing in curl (which httr uses, which googleAuthR uses) - retryRequest is what gets called when a request fails, and it checks whether it should try again (it shouldn't in this case).

Having a look through the httr issues, this looks close to what you are asking (r-lib/httr#253), and there is a solution here:
https://gist.github.com/jeroenooms/a68cd7908e1dbea89535

...if true, then the body for googleAuthR would need to be something like:

body = list(
  media = httr::upload_file(media, type = "image/png")
)

Let me know how it goes.

jess28 (Author) commented Mar 31, 2016

Yes, I did try this! The issue with this solution when it comes to googleAuthR is that the body is automatically turned into JSON. When I try to use httr::upload_file(), the JSON parsing errors out with: Warning: Error in : No method asJSON S3 class: form_file

Is it possible to prevent the JSON parsing? Or is this something I just need to do in plain httr?

MarkEdmondson1234 changed the title from "Using POST to upload files to Google Storage" to "Support non-JSON uploads" on Mar 31, 2016
MarkEdmondson1234 (Owner) commented:

Right, yes, it assumes all requests are JSON. Try using the customConfig parameter to override the encode parameter, something like:

upload_jpg <- gar_api_generator("the.url.google.storage.com", "POST", customConfig = list(encode = "form"))

The trouble is I don't know if that will override the existing JSON encoding. If it doesn't work, construct the request in httr, show me here, and I'll look at adding support for it in googleAuthR.

jess28 (Author) commented Mar 31, 2016

Yeah, that didn't work to override it. I'll work out something in httr and let you know what ends up working.

jess28 (Author) commented Apr 1, 2016

Alright, I got it to work in Shiny! Here is basically what I did. I think all you need to do for googleAuthR to work with this is let the user choose whether or not the body gets parsed into JSON. Doing it automatically means the user cannot use httr::upload_file().

## in global.R
library(shiny)
library(httr)
library(jsonlite)

scope <- "myScopes"  ## placeholder for the scopes I need
secrets <- jsonlite::fromJSON("path/to/my/secret.json")
endpoint <- oauth_endpoints("google")

service_token <- oauth_service_token(endpoint, secrets, scope)

## in server.R
shinyServer(function(input, output, session){

  ## placeholder path and MIME type; with fileInput this comes from input$picture
  the_body <- reactive({ upload_file("path/to/file", type = "mime type") })
  the_url <- "upload/url"  ## can use sprintf to add in the query parameters

  ## upload_result is a placeholder name - the reactive needs to be consumed
  ## somewhere (e.g. by an output) for the POST to actually fire
  upload_result <- eventReactive(input$submit, {
    POST(the_url,
         config(token = service_token),
         body = the_body(),
         add_headers("Content-Type" = "mime/type"))
  })
})

## in ui.R
shinyUI(
  fluidPage(
    fileInput("picture", "picture"),
    actionButton("submit", "submit")
  )
)

As you can see, I'm using a service token rather than user authentication, but I can't imagine it's much different either way. I think my only problem was that, since I was trying to upload pictures, the automatic JSON parsing of the body was not the right answer for me.

Thanks for your help!

MarkEdmondson1234 (Owner) commented:

Great job - OK, try the latest version on GitHub: it now allows customConfig to change all the parameters being sent to httr, including encode.

As you say, the reason you may still want to use googleAuthR even though you have it working in httr is the Shiny multi-user authentication, as the token is taken care of by with_shiny() - so this is a valuable addition. Thanks for bringing it to my attention.
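So something like this should be possible now (an untested sketch - the bucket, object name and file are placeholders, reusing the gar_api_generator pattern from earlier in this thread, and the exact encode value may need experimenting with):

## untested sketch - bucket, object name and file path are placeholders
upload_img <- googleAuthR::gar_api_generator("https://www.googleapis.com/upload/storage/v1",
                                             "POST",
                                             path_args = list(b = "myBucket", o = ""),
                                             pars_args = list(uploadType = "media",
                                                              name = "myObject"),
                                             ## encode now gets passed through to httr,
                                             ## overriding the default JSON encoding
                                             customConfig = list(encode = "form"))

req <- upload_img(the_body = httr::upload_file("image.png", type = "image/png"))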

MarkEdmondson1234 added a commit that referenced this issue Apr 1, 2016
jess28 (Author) commented Apr 1, 2016

Thank you very much, this works great! I'm going to close the issue; it seems solved to me!

jess28 closed this as completed Apr 1, 2016
MarkEdmondson1234 (Owner) commented:

Hmm, actually this fix broke a test for me, so I think I will need to revert it - BUT when I tried my own upload of pictures, it worked fine with the old version. Here is my code - could you compare it to yours?

#' Upload a file of arbitrary type
#'
#' Requires scope https://www.googleapis.com/auth/devstorage.read_write
#'   or https://www.googleapis.com/auth/devstorage.full_control
#'
#' @param file Filepath to what you are uploading
#' @param bucket Bucket name you are uploading to
#' @param type MIME type of the file (guessed from the file if NULL)
#' @param name What to call the file once uploaded
#'
#' @export
gcs_upload <- function(file, bucket, type = NULL, name = NULL){

  ## simple upload, <5MB
  ## the path_args/pars_args here are placeholders that get overridden
  ## by path_arguments/pars_arguments in the call below
  up <-
    googleAuthR::gar_api_generator("https://www.googleapis.com/upload/storage/v1",
                                   "POST",
                                   path_args = list(b = "myBucket",
                                                    o = ""),
                                   pars_args = list(uploadType = "media",
                                                    name = "myObject"))

  req <- up(path_arguments = list(b = bucket),
            pars_arguments = list(name = name),
            the_body = httr::upload_file(file, type = type))

  req$content
}
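Called outside Shiny, usage is just something like this (the bucket name and file here are placeholders, after authenticating with the devstorage scope):

## placeholders for scope, bucket and file
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.read_write")
googleAuthR::gar_auth()

gcs_upload("image.png", bucket = "my-bucket", type = "image/png", name = "image.png")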

MarkEdmondson1234 added a commit that referenced this issue Apr 1, 2016
jess28 (Author) commented Apr 2, 2016

This code is what worked with the original version? Hmm, that is very strange; it looks pretty much like what I was trying to do. Did you try using this in Shiny or just as-is?

MarkEdmondson1234 (Owner) commented:

Once it works offline it should work in Shiny - that's a big aim of the library.

Here is a working version with Shiny using the gcs_upload() function above, which I have now put in the start of a GCS library here: https://github.com/MarkEdmondson1234/googleCloudStorageR

library(shiny)
library(googleAuthR)
library(googleCloudStorageR)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")

## you need to start the Shiny app on port 1221,
## as that's what the default googleAuthR project expects for OAuth2 authentication

## options(shiny.port = 1221)
## print(source('shiny_test.R')$value) or push the "Run App" button in RStudio

shinyApp(
  ui = shinyUI(
      fluidPage(
        googleAuthR::loginOutput("login"),
        fileInput("picture", "picture"),
        textInput("filename", label = "Name on Google Cloud Storage",value = "myObject"),
        actionButton("submit", "submit"),
        textOutput("meta_file")
      )
  ),
  server = shinyServer(function(input, output, session){

    access_token <- reactiveAccessToken(session)

    output$login <- googleAuthR::renderLogin(session, access_token())

    meta <- eventReactive(input$submit, {

      message("Uploading to Google Cloud Storage")
      with_shiny(gcs_upload,  # from googleCloudStorageR
                 file = input$picture$datapath,
                 bucket = "gogauth-test",  # enter your bucket name here
                 type = input$picture$type,
                 name = input$filename,
                 shiny_access_token = access_token())

    })

    output$meta_file <- renderText({
      validate(
        need(meta(), "Upload file")
      )

      str(meta())

      paste("Uploaded: ", meta()$name)

    })

  })
)

jess28 (Author) commented Apr 4, 2016

I just tried that script with everything installed and got the same error I have been getting when using googleAuthR directly: Error in : No method asJSON S3 class: form_file. I think jsonlite is getting confused by the output of httr::upload_file(). Might this be a version issue? I can't imagine why, but I am using jsonlite_0.9.19 and httr_1.1.0.

I'm also using googleAuthR from CRAN - should I be using the package straight from GitHub? Are there changes that haven't made their way to CRAN yet?

MarkEdmondson1234 (Owner) commented:

Use the GitHub version of googleAuthR, v0.2.0.9000.

I tracked the error down to some feedback messages that try to parse the body as JSON when it isn't JSON.

This corrects it, by only showing the message if the encoding is JSON:

if(!is.null(the_body) && arg_list$encode == "json"){
  myMessage("Body JSON parsed to: ", jsonlite::toJSON(the_body, auto_unbox = TRUE), level = 2)
}

I'll patch it and try to get it onto CRAN ASAP. Thanks for helping to find it!

jess28 (Author) commented Apr 5, 2016

That worked, thank you! I look forward to using this from CRAN.

MarkEdmondson1234 (Owner) commented:

I ran into this problem again myself, so who knows what happened before, but the encode parameter is now passed transparently to httr.
