Brainstorming on Glide 2.0 #170
Comments
@jasonvarga Hey Jay, I'd love to get your input on these changes. I know you've run into some of these exact issues with Statamic. |
@freekmurze I'd also love to get your eyes on this, since it will probably impact your Laravel Glide library as well. |
Loving the sound of this. 👍 Preventing bootstrapping of your entire app is a great idea and something we've struggled with. #152 was created to battle that. My only (super small) issue with the new filename structure is that some of our users love to be able to have some control over the actual filename. So if you try to save the image, you get the filename in it. In Glide v1 we'd allow that by having an additional URL segment that is simply ignored by Glide. Maybe you could include it similar to this.
That way, you at least see The URL generator and signature stuff is great. I don't think people would (or should) ever really just be editing URL params anyway. Finally, being able to manually pass params into Glide is something we use heavily. ( |
Like this! Being able to serve a cached image from the server/disk is something I'm really looking forward to! To piggyback on @jasonvarga's filename suggestion for downloading, a URL structure like:
could "solve" that. You still group the images in source folder - the extension-less kayaks part, then the modifications and the filename. Another aspect of this is we could also use the modification part of the URL as the signature. Sign that. ie:
ideally a short hash could be used there.. something like http://hashids.org/php/ but I may be putting the cart before the horse here haha. That really may not be feasible nor secure ? |
Awesome, thanks for the feedback @jasonvarga. A couple more questions, if I may.
Honest question: what is the benefit for you to using Glide if you're not using the HTTP functionality? Do you simply prefer the API that it provides, over say using Intervention Image directly?
Interesting. I think we could probably do that. In fact, you could technically put whatever you wanted in the cached filenames. However, one other idea. What if we simply set the source filename in the header, with the manipulation parameters, but without the signature?
header('Content-Disposition: inline; filename="kayaks-w-800-h-500-fit-crop.jpg"');
The URL would still look like what I proposed in the browser, but if you saved it you'd get a nicer name. What do you think?
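A minimal sketch of how such a header value could be assembled, assuming the image is still being served through PHP (once the web server serves the file directly, PHP no longer controls the headers). The function name `dispositionHeader` and the `key-value` naming scheme are illustrative assumptions, not part of Glide.

```php
<?php
// Hypothetical helper (not part of Glide): builds a Content-Disposition
// header line from the source filename and the manipulation parameters.
function dispositionHeader(string $sourcePath, array $params): string
{
    // Strip anything that is unsafe inside a quoted filename.
    $clean = fn (string $s): string => preg_replace('/[^A-Za-z0-9_.-]/', '', $s);

    $info = pathinfo($sourcePath);
    $name = $clean($info['filename']);

    // Append each manipulation as "-key-value", in the order given.
    foreach ($params as $key => $value) {
        $name .= '-' . $clean((string) $key) . '-' . $clean((string) $value);
    }

    // Fall back to "jpg" when the source has no extension (an assumption).
    $ext = $clean($info['extension'] ?? 'jpg');

    return 'Content-Disposition: inline; filename="' . $name . '.' . $ext . '"';
}

// Usage: header(dispositionHeader('kayaks.jpg', ['w' => 800, 'h' => 500, 'fit' => 'crop']));
```

The signature is deliberately left out of the filename, matching the idea above.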
In this case, the API is easier to use than Intervention. The pre-generated stuff is just an option if the user needs the performance boost. We do use the HTTP method, and we built in a URL generator into our template language essentially the same as the Blade example you mentioned.
Maybe I'm misunderstanding, but if the web server is going to be serving the images, doesn't that mean you won't have control over the headers anymore? |
@jasonvarga I obviously need more coffee. That is correct. ☕️ |
@cdowdy Thanks for the input! I wonder if we can combine some ideas here:
With this URL you get the signature, but it won't be part of the filename. You get the original source filename in there, plus the image manipulations. I don't think hashing the image manipulations would work as a replacement for signatures, since HTTP signatures really need to be one-way, whereas a Hashids-style hash is meant to be decoded.
The other-other option is to put the manipulations in a folder and leave the filename as-is.
Although, maybe it's nice to have the manipulations in the filename. |
That's kind of what I was thinking. |
So, here is one more curve ball to all this. And this is really where the can of worms opens up. This whole system that we're discussing will work locally because of how web servers are able to fallback to PHP. However, this will not work for any other storage systems, like Amazon S3. If Glide 2.0 could come up with a way to solve those as well...then we'd really be winning. I've been thinking more about the URL generator, and how that could help. Here is an idea:
There are two problems here:
The second problem here isn't that big of a deal. If you're worried about the first user you can always warm up your cache by eager generating your images. The first problem is much more challenging. I'm not really sure how to overcome this one. The only idea I can think of is a local cache file that keeps track of all the images that have been cached. This could be a PHP file, or even saved in a database. It feels messy though. I'd like to keep thinking on this one. |
What I'm suggesting here would add complexity that may not be needed :) haha. What if you were to save any image using the S3 adapter (or any other non-local one) to memory, then on return visits provide the S3 URL through x-accel/x-sendfile? The problem with the x-sendfile variants is that you still need to make that call to the externally hosted images, and I'm not sure how that degrades in instances where the image still isn't uploaded to S3. Or a cookie/session/token needs to be set, which... yeah, not always ideal. Aside from your latest can-o-worms haha, after thinking on it, as of right now I prefer yours and Jason's suggestion for the filename/modifications and how the URL is structured.
When you say to memory, what do you mean? Local cache file or database? While x-accel/x-sendfile are really cool, I think using these features is just too much for the Glide project. I'd like Glide to work without them, but still allow "power users" to take advantage of these kind of web server features if they want to. I just don't want them to be requirements. |
I'm wondering if Flysystem's caching layer can help here. In particular the "Persistent Caching". I think this does exactly what I was thinking. It looks like it supports multiple caching methods as well, including the local disk, Redis, Memcached and more. @frankdejonge, are you able to jump in here for a sec and let me know if this would work? Basically we want to be able to check if a cache file exists on Amazon S3 (or elsewhere) over and over. I'm worried about the performance of this, and was hoping caching could help. |
I think I know how I'm going to do this! 💡💡💡 First off, if you use local storage for caching, then the above technique will work out of the box. The web server will try to find the image, and if it cannot it will fall back to PHP. That works perfectly. Next, if you use non-local storage for caching, that is fine, but by default your images are still served using Glide via PHP. This is exactly how Glide currently works. This is especially helpful in situations where you have multiple servers, since you'll want to host your Glide cache on something like Amazon S3, otherwise each server will need to process the image. However, if you don't want your images served via PHP you simply need to use a CDN or storage that supports fallbacks. In this setup you always request your images from the CDN, and in the event that the cached versions don't exist, they fall back to PHP to be generated. When they are generated they are saved to the CDN, so this will only happen once. I've actually got this working with Amazon S3, and it's pretty straightforward:
I honestly feel like this gives the best of all worlds. For those caching locally, it just works. For those looking to host the cache on a CDN or remote storage, there is a little more setup, but it's still quite doable. More information on Amazon S3 bucket routing can be found here. |
@reinink I read the email before and what you came up with was exactly what I wanted to suggest. Another alternative is to put Varnish in front of Glide, which is pretty much a "roll your own" solution for these kinds of situations. I actually use that on some projects and it works really well. Alternatively you could even use nginx's cache functionality for this, which is super fast and super easy to set up. Chances are you've already got nginx in your stack, in which case it's a quick win without the need for a big investment in time, infrastructure, and maintenance.
TL;DR: since glide's use case is mostly on http, why not use http tooling? |
@frankdejonge Yes, totally. The thing is people use Glide because it has a really nice API, not necessarily because it's HTTP based. While that's certainly part of it, I'm trying to find a nice balance of sensible defaults, while still allowing for CDNs and other caching mechanisms like Nginx, Varnish, etc. |
@frankdejonge Or put differently, Glide is still a PHP tool. I want it to work in standard PHP configurations without needing any custom web server configuration. |
@reinink I get that. Then again, tackling scaling issues often requires a developer to not just use one tool. |
@frankdejonge Again, totally agree. However, I suspect most people using Glide aren't dealing with scaling issues. I want the library to work really well out-of-the-box for "typical" PHP projects while still being compatible with caching/CDNs layers when you need it. |
@reinink Loving the sound of all this. The S3 fallback URL is very cool (didn't know that existed either). The latest version of laravel-glide doesn't support the entire feature set of Glide. When Glide 2.0 is released I'll be sure to create a new major version of laravel-glide that'll support all the features that Glide 2.0 will offer. Have fun creating Glide 2.0, I'll keep an eye on this thread.
We use the Content-Disposition header to give our images a nicer filename for download. Example: We add this header:
We get all the caching benefits (we don't use the Glide cache at all, only the Nginx cache), but still have a nice filename if someone saves the image. We do not want image manipulations in the pathname. If, however, one chooses to use the Glide cache it's great to have all variations somewhere under the filename for easy cache invalidation. But isn't that orthogonal to passing of manipulation parameters? |
@emilv Yeah, with Glide 2.0 you'll totally still be able to run with this setup. The proposed changes above are simply for those who don't get this sophisticated with their setups. I'm curious, would you be willing to share how you did your nginx caching? |
We use fastcgi_cache:
Anything with Expires headers gets cached. (Careful not to include Set-Cookie headers on these responses or else they won't get cached; we use fastcgi_ignore_headers and fastcgi_hide_headers to remove any cookie headers on these responses) |
The other thing I want to review for Glide 2.0 is how the responses work. I have a couple of issues with them:
Issue 1: They are complicated
I like that Glide supports HttpFoundation, PSR-7 and CakePHP, but I find the configuration quite complicated. I worry that many users of Glide end up using the outputImage method instead of using proper response objects. This has led to issues of images not displaying properly. You have no idea how often people accidentally output whitespace or comments after outputting an image and things break. Using proper response objects that can be handled by a framework's standard flow is much better, and I'd like Glide 2.0 to make this easier to set up.
Issue 2: Too many packages
I've tried to solve how difficult responses are to set up by creating support for common vendors. However, this has come with its own pain. We now have six response factory packages that need to be maintained. We currently have these packages:
Each of these packages only includes one file: a simple response factory. Slim and Zend both use PSR-7. Symfony and Laravel both use HttpFoundation. Cake has its own request/response objects (although I believe they are moving toward PSR-7). Literally the only reason I have them split into separate repos is for testing. In order to test the Cake response factory I need to install the entire Cake framework. Same for Symfony, Slim, and the rest. Putting these all as dev dependencies of Glide just isn't practical. However, managing six packages isn't practical for me long term either.
Goals:
Moving forward, I'd like to:
I've got two ideas on how to do this:
Solution 1: HttpFoundation & PSR-7
The first option is to remove all the vendor-specific implementations (and packages) and only offer the following two response factories right in the main package:
If one of those two doesn't work, you can easily add a custom response factory by implementing the
To test this we would include
Solution 2: HttpFoundation & PSR-7 Bridge
The second option is a bit more intense, but also simpler. We go back to an HttpFoundation-only setup. This would require no response configuration, since Glide would always return an HttpFoundation response. However, understanding that PSR-7 is used by a number of popular frameworks, I still want to support this somehow. I think the PSR-7 bridge is the perfect solution here. The only question would be whether we build in support for this bridge directly, or just explain to users how they can implement this themselves.
Wanted to chime in here regarding serving generated image files from external caches. I've done a fair bit of work with this and the Drupal Flysystem module. The currently proposed solution for S3 with the fallback is pretty nifty, I didn't know that existed. The only problem I see is that it's S3 specific. I have no idea if other systems support that. There are 2 generic ways to handle this that I'm familiar with:
Option 1 is simple enough, but has scalability issues that we've run into with Drupal. Option 2 is what we're converting to, but it probably requires too much framework integration to do generically. You need some sort of background task, and a lock to avoid copying the file multiple times. Also, what about putting the signature in a GET param? It doesn't need to be in the filesystem, and would only be verified when PHP is doing generation.
@twistor Hey! Thanks for participating in this discussion. Some thoughts/responses:
Agreed, this is a very specific S3 feature. How many other storage providers support this type of functionality, I simply don't know. However, the concept is a pretty common one with CDNs. You reference an image on your CDN and if it does not exist (a cache miss) it will fallback to the main server (Glide) to get the image.
Both of your solutions require this check. This was actually my original idea (see above), but I don't think it will work very well. On a page with a lot of images this could take up a ton of time, since each "exists check" requires an HTTP request. At that point it really begs the question if it's even worth showing the remote URL or just hitting PHP directly. Do you slow down the rendering of the HTML page just to get a speed boost on the loading of your images? That just seems wrong. I too thought Flysystem caching could work here, but I think this is going to cause all sorts of other issues. It's going to require extra configuration to use Glide. There could be issues with the cache getting too large on sites with thousands of images. How frequently does this cache get updated? How does the cache get updated when a new image is generated? And who knows how many other tricky things we could run into here. As far as I can tell right now, the way to get the best possible performance is to:
That said, there is absolutely no reason why you couldn't do a cached image "exists check" on your own project and output a different URL based on that check. I just don't think it's a good default for Glide.
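A minimal sketch of what such a project-level check could look like, assuming the "exists" lookup is a cheap local one (for example an in-memory or on-disk manifest) rather than an HTTP request per image. The function name `imageUrl`, the manifest array, and both base URLs are hypothetical.

```php
<?php
// Hypothetical helper (not part of Glide): return the remote/CDN URL when the
// cached image is known to exist, otherwise the URL that hits Glide via PHP.
function imageUrl(string $path, callable $exists, string $cdnBase, string $appBase): string
{
    $base = $exists($path) ? $cdnBase : $appBase;

    // Normalise slashes so the two parts join with exactly one "/".
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

// Assumed local manifest of images that have already been generated.
$manifest = ['kayaks/w-800.jpg' => true];
$known = fn (string $p): bool => isset($manifest[ltrim($p, '/')]);

$url = imageUrl('/kayaks/w-800.jpg', $known, 'https://cdn.example.com', 'https://example.com/img');
```

Keeping the check behind a callable leaves the choice of lookup (manifest, database, Flysystem cache) to the application, which is where the trade-offs discussed above would have to be made.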
That's an interesting question. I guess because I've been trying to avoid query strings in this release since I've heard that they can cause browser level caching issues (see this and this). Since we don't know if the cached file exists at the point of URL generation (as discussed above), we'd have to always output the signature, and so we'd always have a query string. But maybe that's okay? It would nicely simplify the URLs. Do you (or anyone else) know if the query string caching issues are real? |
Flysystem's cache will handle that avoiding extra network calls.
Yep, we wrote a separate cache adapter to support this. Flysystem's default of storing the whole cache in one object does create problems with large filesystems.
I think you're right though, more sophisticated setups would require application/framework level support. So, it makes sense to make the defaults simple to use.
I know that Drupal uses an |
@ADmad Awesome, that's what I was hoping for! 🍰🏄 |
It also has |
I didn't like that either so would be nice if the newer version doesn't hardcode expires headers or allows them to be configurable. |
Well, technically with Glide 1.0 you get a response object that you can modify to your heart's content. We simply set sensible defaults. Is there a reason why this approach doesn't work?
Probably I could have modified the header, don't remember exactly why I chose to make my own response class instead. |
@ADmad Report back if you remember. I'm not opposed to adding more configuration options, but if this approach works just as well I'd probably prefer to leave it as is. |
This may or may not be useful to your conversation, but I'm actually using Glide 1.0 to store images almost exactly how you describe in your 2.0 proposal (Serve cached images via web server). If the file exists, Apache/nginx returns it automatically, and if it doesn't exist then Glide generates it for me. The way I handle parameters is by creating a configuration with a set of named "Image Styles".
return [
    'large' => [
        'w' => 400,
        'filt' => 'sepia',
    ],
];
The routes then look like
I don't necessarily think this should be the default behavior, but I just wanted to provide an example of how a similar kind of server delivery/invalidation is working already.
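A minimal sketch of how a named style could be resolved into manipulation parameters. The `{style}/{path}` route shape and the function name `resolveStyledRoute` are assumptions made for illustration; they are not taken from the setup described above.

```php
<?php
// Hypothetical style map, mirroring the "Image Styles" configuration idea.
$styles = [
    'large' => ['w' => 400, 'filt' => 'sepia'],
];

// Split an assumed "{style}/{path}" route into a source path and the
// manipulation parameters registered for that style.
function resolveStyledRoute(array $styles, string $route): array
{
    [$style, $path] = array_pad(explode('/', ltrim($route, '/'), 2), 2, '');

    // Only known styles are accepted, so arbitrary params cannot be requested.
    if ($path === '' || !isset($styles[$style])) {
        throw new InvalidArgumentException("Unknown style or empty path: {$route}");
    }

    return ['path' => $path, 'params' => $styles[$style]];
}

$resolved = resolveStyledRoute($styles, 'large/kayaks.jpg');
```

Because only pre-defined styles are accepted, this approach limits which variations can be generated without relying on per-URL signatures.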
@reinink I think it would be a very useful feature if Glide could zoom images without zooming their pixels. This method is used by Uploadcare and I think it can be very useful. I don't know exactly how it works, but it seems that when the requested size is larger than the actual image size they use a different approach, and it looks like below
Discovered glide this morning. I'm impressed with what you've created so far! I noticed you considered iiif compliance back in 2015 here and I'm curious to know if that was still being considered. I work in the museum sector and iiif is thrown around a lot as the next big standard but there aren't really any good, flexible solutions currently (especially php based solution). I also wondered if you had any interest in adding support for generating zooming image formats like Image Pyramids or Deep Zoom Images. Again, good php implementations are few and far between. |
@daniel-keller I think IIIF compliance is totally doable still, I just haven't had the time. It's really just a URI spec from what I can tell. You could totally implement it in a specific application by simply creating routes that follow this spec and then passing those values to Glide manually (instead of following the default Glide URI format). As for Image Pyramids or Deep Zoom Images, I honestly don't even know what those things are. However, it's worth noting that Glide is really just a nice wrapper around the Image Intervention library. Depending on what these things are they may be better implemented at this lower level. |
How about using the php-vips library instead of Intervention Image? Because resizing an image with libvips is typically 4x-5x faster than using the quickest ImageMagick. There is an open issue at Intervention/image#615 to incorporate this high-performance backend into the Intervention Image library. So maybe it's not necessary to make many changes to Glide to benefit from this library. Disclaimer: |
First thanks for the amazing work. Love the library. |
Hey all, was reading up on this since I wanted to start using Glide to remove a few (3 mil+) WordPress thumb files and generate / serve them on the fly. I noticed there was no documentation for NGINX / Apache caching / direct serving and stumbled upon this conversation. I either read over it or missed it completely but you talked about maybe even making the signature mandatory, how about making the signature part of the solution for the caching. Since a signature only changes if you change the key or the parameters it seems like the perfect cache key. NGINX / Apache could extract the This was my initial thinking on solving this with Glide 1.0 but maybe I am overlooking something really big and that is why no one mentioned this before. |
(also relating to #83) With nginx, I think an elegant and effective way of serving images would be - as an alternative to storing rendered images in a public folder with all problems mentioned above - to use a double proxy and then use the proxy cache (not the fastcgi-cache)
this will result in:
I think this could only be accomplished by nginx, not apache. but if you're doing serious image serving nginx is preferred anyway (to me). I have done this setup (though not with glide as an image processing backend), if you're interested i can post a sample nginx config. |
@reinink Maybe too early to say, but have you had time to chew on this? How's 2.0 looking? |
Great thread and I like the sound of v2.0. I came here after looking at Croppa (requires Laravel) and LiipImagineBundle (requires Symfony) as I need something framework-agnostic. Something like Croppa but stand-alone would be my ideal, and the plans for v2 are very close to this. |
Really liking your plans to optimize performance in 2.0. I find that using Glide is currently so much slower than loading a static image, even when the cached image already exists, and that's a big problem. When do you foresee starting work on this? I see there haven't been any commits to the 2.0 branch in a long time.
@vesper8 Fair question! I'm planning to get Glide 2.0 out in the new year! Probably will end up being February, some time after Laracon Online. I've been taking a brief hiatus from Glide to build Tailwind CSS. I'm looking forward to getting back into the 2.0 release though. |
I am one of those Statamic users who is dying for the ability to preserve filenames (see statamic/v2-hub/issues/1819). I also love Tailwind CSS so I appreciate the time you're spending there as well. Do you have an updated timeline for 2.0 at this point? Thank you SO MUCH for all the good work you're doing to make my experience building a modest bit of Internet so much easier and better for users. |
February and Laracon Online have come and gone but I'm still really looking forward to this new release, especially if it's going to use better caching and won't need to hit the back-end and Glide on every request, which is currently killing performance on my image-heavy website! An updated timeline would be awesome @reinink !
Sorry folks, this is absolutely still on my radar, but I cannot give a timeline yet. Sorry!
Thank you for the update! If it would be helpful to sponsor some of your time on this project I doubt I’m the only one willing to propose that to partners/bosses. |
Glide is awesome, I'm looking forward to use the V2 Thanks :) |
v2.0 pls :) |
A bit late, but I did some testing and think @frankdejonge is right. You can speed up the images with nginx pretty easily, and it is probably more flexible than rolling your own cache (e.g. writing to disk), because you can let nginx take care of the validity / max cache size and possible dynamic WebP conversions. If anyone is interested, I did some benchmarks on the cheapest Digital Ocean droplet (1 vCPU, 1GB) and compared directly loading a file with nginx, loading it with Glide in-memory and normal file cache, and Imgproxy (which doesn't have its own cache). Finally also Imgproxy + nginx proxy_cache, but this should be the same for Glide as it doesn't touch Glide/Imgproxy for cached results.
So if I'm testing correctly, the overhead of running Glide with the current File Cache is pretty low (35 vs 37ms). At least on the minimal example I tested (with strict opcache, just Glide, no framework/router etc). Only the generation of 'fresh' files is pretty slow compared to Imgproxy (5x as slow). |
I've started thinking about Glide 2.0. The purpose of this issue is to share those ideas and get some discussion going about it.
Serve cached images via web server
My number one goal for Glide 2.0 is allowing cached images to be served by the web server, instead of Glide. Meaning the first request to a manipulated image would hit PHP (and Glide), but all subsequent requests will be handled by the web server automatically. Further, beyond the standard front-controller rewrite rule (routing all requests through index.php), I don't want there to be any extra web server configuration required.
The simplest way to accomplish this is to save the cached images in the same place, and with the exact same URL as the requested images. This way, when the first request comes in, the cached image won't exist, so the request will be routed to PHP and handled by Glide. However, all subsequent requests will hit the cached image directly, without hitting PHP. This should greatly improve the performance of loading cached images.
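A minimal sketch of that flow, assuming the front controller only receives the request when the web server could not find a file at the requested path. The function name `serveOrGenerate` and the `$generate` callable (standing in for the actual image manipulation) are hypothetical and not Glide's API.

```php
<?php
// Hypothetical front-controller step (not Glide's API): generate the image on
// the first request and store it at the exact path that was requested, so the
// web server can serve every later request without touching PHP.
function serveOrGenerate(string $publicDir, string $requestPath, callable $generate): string
{
    // Refuse path traversal before touching the filesystem.
    if (strpos($requestPath, '..') !== false) {
        throw new InvalidArgumentException('Invalid image path.');
    }

    $target = rtrim($publicDir, '/') . '/' . ltrim($requestPath, '/');

    if (!is_file($target)) {
        $dir = dirname($target);
        if (!is_dir($dir)) {
            mkdir($dir, 0775, true);
        }

        // Write to a temporary name, then rename, so the web server never
        // serves a partially written file.
        $tmp = $target . '.' . uniqid('', true) . '.tmp';
        file_put_contents($tmp, $generate($requestPath));
        rename($tmp, $target);
    }

    return $target;
}
```

The `is_file` check mostly guards against two first requests racing each other; in normal operation the web server would have served the existing file before PHP was ever invoked.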
New URL format
This will not be possible with GET parameters since they are not treated as part of the filename. Further, the proper file format extension (jpg, gif, png) is also required so that the web server will output it properly. Finally, this will also require saving images in a public folder. Here is a proposed format that I've been testing with that works well:
It's obviously not as nice as the current GET based URLs, but maybe that's okay. By grouping the cached images into source image folders it's easy to delete all the cached images for a specific image.
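A minimal sketch of how a path-based cache name could be derived, assuming a hypothetical layout of one folder per source image with the manipulations encoded in the filename. The exact layout, the `key-value` encoding, and the function name `cachedImagePath` are illustrative assumptions and are not the format proposed in this issue; `fm` is Glide's existing output-format parameter.

```php
<?php
// Hypothetical path builder (layout is an assumption): one folder per source
// image, manipulations encoded in the filename, real extension at the end.
function cachedImagePath(string $sourcePath, array $params): string
{
    $info = pathinfo($sourcePath);

    // Sort so that the same set of params always yields the same path.
    ksort($params);

    $parts = [];
    foreach ($params as $key => $value) {
        $parts[] = rawurlencode((string) $key) . '-' . rawurlencode((string) $value);
    }
    $name = $parts ? implode('-', $parts) : 'original';

    // Folder is the source path without its extension.
    $folder = ($info['dirname'] !== '.' ? $info['dirname'] . '/' : '') . $info['filename'];

    // Use the requested output format if given, else the source extension.
    $ext = $params['fm'] ?? ($info['extension'] ?? 'jpg');

    return $folder . '/' . $name . '.' . $ext;
}
```

With a layout like this, deleting the folder for a source image removes every cached variation of it in one step.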
URL generation
I really feel like URL generation using a helper method is going to solve a lot of issues. Right now Glide ships with a URL generator, but it's sort of buried in the security documentation. I think this has been a bit of an issue with Glide 1.0. It makes it look like you can easily just edit URLs, but in reality if you sign your images (and you absolutely should be) you have to use some type of URL generator.
By making a URL generation helper more of a first-class feature, it will help minimize the pain caused by the less desirable URL format. Plus it will make HTTP signatures easier to work with. It will look something like this:
HTTP signature simplification
HTTP signatures would work in the same way as they currently do, except they will also be part of the filename. I want to make HTTP signatures easier to work with in 2.0, since they are so important. Part of me is even tempted to require them. I want to make it possible to set the sign key right on the Glide object, and then let Glide check this value automatically when generating images, as opposed to right now where you must manually check the signature. This isn't a big step, but I think it's a hurdle for people new to Glide, and it's easy to skip setting them up.
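A minimal sketch of the general sign-and-verify idea, using HMAC-SHA256 from PHP's standard library. This is a swapped-in illustration and not Glide's actual signature implementation; the function names `signPath` and `verifySignature` are hypothetical.

```php
<?php
// Hypothetical signing helpers (not Glide's implementation): the signature is
// a keyed one-way digest of the path plus the sorted manipulation parameters.
function signPath(string $key, string $path, array $params): string
{
    // Sort so that parameter order does not change the signature.
    ksort($params);
    $payload = ltrim($path, '/') . '?' . http_build_query($params);

    return hash_hmac('sha256', $payload, $key);
}

function verifySignature(string $key, string $path, array $params, string $signature): bool
{
    // hash_equals compares in constant time to avoid timing leaks.
    return hash_equals(signPath($key, $path, $params), $signature);
}

$key = 'example-sign-key'; // assumption: in practice a long random secret
$signature = signPath($key, '/kayaks.jpg', ['w' => 800, 'h' => 500]);
```

Checking the signature inside the library, before any image is generated, is what would remove the manual step described above.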
Securing image URLs
As already noted, to have the cached images loaded by your web server you must place the cached images in a public folder. However, we could still keep the current functionality, where the cached images are loaded through PHP. This can be helpful in situations where you want to do authorization checks on the cached images before displaying them. To make this work you would simply make your cache folder non-public.
From my experience, this isn't a very likely requirement. Typically HTTP signatures are sufficient security, since these URLs are next to impossible to guess. Just something to consider.
Supporting the old functionality
Finally, I hope to still support the old functionality. This may look like this, although this is still open for discussion:
Many people use Glide without using GET params, so being able to manually pass in an array of manipulation params is also important.