
Example of a drake-powered Shiny app #4

Open
wlandau opened this issue Oct 10, 2018 · 8 comments
Labels
good first issue 👋 Good for newcomers help wanted Extra attention is needed

Comments

@wlandau
Owner

wlandau commented Oct 10, 2018

cc @rich-iannone

@wlandau wlandau added the good first issue 👋 Good for newcomers label Jan 24, 2019
@wlandau
Owner Author

wlandau commented Sep 30, 2019

We could use drake's HPC (make(parallelism = "clustermq")) and a non-blocking process to show progress() and/or vis_drake_graph() in the app UI.
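A rough sketch of how that might look (hypothetical plan and output names, not from this thread; assumes clustermq is already configured for your scheduler): launch make() in a background process with callr so the Shiny session stays responsive, then poll the on-disk cache for progress.

```r
library(shiny)
library(drake)

# Hypothetical plan for illustration only.
plan <- drake_plan(
  raw     = rnorm(1e6),
  summary = mean(raw)
)

# Non-blocking build: make() runs in a separate R process.
px <- callr::r_bg(
  function(plan) drake::make(plan, parallelism = "clustermq", jobs = 2),
  args = list(plan = plan)
)

server <- function(input, output, session) {
  output$status <- renderTable({
    invalidateLater(1000, session)  # re-poll every second
    drake::progress()               # reads target progress from the cache
  })
}
```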

@samirarman

I thought I could use drake's caching and its ability to monitor ETags to speed up app loading.

Not sure if shinyapps.io already caches the content.

The minimal example I had in mind can be found here.

I've published one version using the drake approach, and another reading the data directly.

Can't tell which one loads faster, though.

@wlandau
Owner Author

wlandau commented Jun 22, 2020

That's so interesting. If you keep going with this, please let me know how it turns out. If you do find some benefit from using drake to manage remote data in Shiny apps, I would be happy to review a PR to this repo.

I see you use drake_gc() to keep the cache size low. One option to consider is to avoid storage altogether with cache <- storr::storr_environment(); make(cache = cache, history = FALSE). cache$gc() should be faster because the storr is totally in memory. Garbage collection takes a long time in general, though, so I'm not sure how much of a speed gain you will actually see.
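For illustration, a minimal sketch of that in-memory approach (the plan and target names here are hypothetical):

```r
library(drake)

cache <- storr::storr_environment()  # lives in RAM; nothing touches disk
plan <- drake_plan(
  dat = mtcars,         # hypothetical targets
  avg = mean(dat$mpg)
)

make(plan, cache = cache, history = FALSE)
readd(avg, cache = cache)  # retrieve a target from the in-memory cache
cache$gc()                 # garbage collection on an environment storr
```

Note the trade-off: an environment storr vanishes when the R process exits, so this only helps within a single app session.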

@wlandau wlandau added the help wanted Extra attention is needed label Jul 4, 2020
@wlandau
Owner Author

wlandau commented Jul 4, 2020

On reflection, I do not actually think drake and shiny are intrinsically symbiotic. I think their goals are independent, and whether you use both together just depends on the use case. I will leave it to the community to volunteer examples if available, and I would be happy to accept pull requests.

@samirarman

In fact, I think so as well, but I tend to see uses for drake all the time, as I'm a big fan of it 😅 .
I have a running app hosted on shinyapps.io that transfers data from another source all the time (one of the millions of COVID dashboards available).

The problem is that, at this time, shinyapps.io doesn't have a persistent storage option, so caching with drake in an elegant way is not an option (RStudio says it will have persistent storage soon).

I have two options though:

  1. Storing the drake cache elsewhere, which would lead to less data transfer but messier code.
  2. Hosting the app elsewhere, where persistent storage is available.

Anyway, I'll keep trying, and I hope someone can come up with a real production app that could be used as an example.

@wlandau
Owner Author

wlandau commented Jul 4, 2020

Yeah, that's a tricky point of friction. The situation is slightly better at my company, where we use Shiny Server Pro and can host apps with access to our cluster's file system. But that setup is not as common as shinyapps.io.

@jfunction

When you say "drake-powered Shiny app" what exactly is meant by this? Having RShiny display data from drake targets? I have started to wonder about this myself. Let me explain my journey.

I made an RShiny app which pulls in data from various sources. Some of the data is a merge of raw data which is filtered/modified before being fed into plots/tables, while some of the data is more for display on the front end (e.g., colours to fill certain plots given what is being plotted, parameters for selectInput elements, and so on). I initially had a loaddata.R doing all of the work, but having source("loaddata.R") in my app.R isn't very explicit about which variables are being sourced in or needed.

So I renamed it makeData.R and reprogrammed it to make the data and write out .RDS files with only the data needed by my app. Then I'd explicitly pull these into my app with something like data1 <- readRDS("someData.RDS"). I usually also have a small comment describing the data next to where I pull it in. At this point my makeData.R script didn't even need to be deployed! Great! I also wrote a manifest and didn't include the raw data which also kept the deployment size small.

The script became quite large and clunky though, and it lacked transparency, so I decided it should really look more like a drake plan. I then made a separate repository for handling all the data related things and producing data artifacts (RDS files) for inclusion in my app. This is now quite general, and some of the data can go into other apps or directly to other stakeholders. Another benefit is that this keeps my shiny app repository clean because it only contains the data I need for the app - no more and no less. In my case the data doesn't need to be modified during the course of a deployment, so this works for me.
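That separation could be sketched roughly like this (file names, columns, and target names are hypothetical, not from this thread): the data repository runs a drake plan whose final targets export RDS artifacts via file_out(), and the app repository reads only those artifacts with readRDS().

```r
library(drake)

# Data repository: the plan ends in explicit RDS artifacts.
plan <- drake_plan(
  raw    = read.csv(file_in("data/raw.csv")),       # hypothetical input file
  merged = transform(raw, ratio = value / total),   # hypothetical columns
  export = saveRDS(merged, file_out("artifacts/merged.RDS"))
)
make(plan)

# Shiny repository (app.R): pull in only the artifacts the app needs.
# merged: filtered/merged data used by the plots and tables
data1 <- readRDS("artifacts/merged.RDS")
```

Because file_out() registers the RDS file as part of the target, drake rebuilds the export whenever the upstream data changes, and the app repository never needs the raw data.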

One con however is around the data targets which are more specific to my app. For example if my drake plan makes a (data-dependent) tibble of timeseries data with colours to fill plots in with, and I want to change the colour of some plot, I have to rerun my drake plan from another repository and move the RDS exports across. So it seems there is still too much coupling going on. As such, I'm now thinking of separating out the kinds of data being generated so that the concern of my data repository is not directly tied to the concerns of my app. That way the app source code should be sufficient to give one a good idea of the app functionality.

Anyway, I hope that story is useful to some, and I welcome feedback. Oh and thanks @wlandau for drake 😄

@wlandau
Owner Author

wlandau commented Sep 14, 2020

Thanks for the interesting use case.

There are probably all sorts of ways drake and shiny can work together; it all depends on the goals and context of the project. Here are the ones I have seen personally, the second and third of which seem similar to yours:

  1. A web interface on top of a full end-to-end data analysis project, ideally powered by a cluster. This can be helpful if you want others to be able to run a computationally intense project without having to know R. A colleague actually built one at work and shared it with external collaborators. The idea was to encapsulate a simulation study that could accept arbitrary user-defined data-generating scenarios. drake helped by skipping scenarios that the user did not change since the last run.
  2. Maintain an external drake pipeline that outputs a data file, then upload this file to a completely separate shiny app with an upload button. There is some work involved in documenting the data standard so others know how to prepare data for upload, but it is far less work than (1), and I have used it successfully at my day job.
  3. Write a drake pipeline that creates the app data and has a final drake target to deploy the app code and the data to RStudio Connect or shinyapps.io. There is an R-Bloggers post about that here. I use drake-powered continuous deployment all the time for R Markdown reports and xaringan slide decks at work.
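Option (3) might be sketched like this (target names and the build_app_data() helper are hypothetical; rsconnect::deployApp() is the real deployment function for shinyapps.io and RStudio Connect):

```r
library(drake)

plan <- drake_plan(
  app_data  = build_app_data(),                           # hypothetical helper
  data_file = saveRDS(app_data, file_out("app/data.RDS")),
  deploy    = {
    # file_in() makes deployment depend on the app code and its data,
    # so drake re-deploys only when either one changes.
    file_in("app/app.R", "app/data.RDS")
    rsconnect::deployApp("app", forceUpdate = TRUE)
  }
)
make(plan)
```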
