[Bug]: Images only saved some of the time #285

Open
crocodisle opened this issue Dec 31, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@crocodisle

ArchiveWeb.page Version

v0.14.1

What did you expect to happen? What happened instead?

I encountered this with a Canvas instance. I was using the ArchiveWeb.page Windows application to archive multiple web pages in a Canvas course. I was archiving multiple pages by clicking the links between them and letting each URL resolve before going to the next one. However, when I replayed a page, I noticed some pictures weren't loading. On certain pages only some of the images would load, and the missing ones showed a "broken link" icon where they were located in the page. Exporting the archive as WACZ and viewing it with the ReplayWeb.page website and Windows application reproduced the issue. Re-archiving the page in the same archive created a new snapshot of it, but the issue persisted.

I downloaded the Vivaldi browser, installed the extension on it, and archived only one of the specific pages I was having an issue with. On replay, the page appeared as expected with all images displaying, but only once, before breaking again. Exporting the WACZ made with the extension on Vivaldi and viewing it on the ReplayWeb.page website and Windows application showed similar behavior, though the images displayed more reliably overall.

I'm not sure what changed, but I went back to the ArchiveWeb.page Windows application and exported only one of the pages in question, just as I did with the Vivaldi browser, and lo and behold it worked. Then I returned to the initial archive I was making of the course with multiple pages. I took another snapshot of the page, replayed it, and now it was fine, strangely. Exporting the WACZ and viewing it in other viewers had the page working as expected with all images visible.

Some additional details:

  • When I was archiving the page, all images were always visible and they looked as expected. It was only when the page was replayed or exported that some images broke. Icons seemed fine; it was only images added in as content. Their URLs were on a different server.
  • Whenever I begin archiving with the Windows application, it gives me an error. This does not show up with the extension on Vivaldi:

image

  • The issue happened with both the Vivaldi extension and the Windows application, even though it worked at first with the Vivaldi extension before breaking.
  • All previous snapshots of the broken page in the original archive of the course had their images fixed as soon as the last working snapshot with those images was taken. This is good, but it took 6 snapshots before I was able to capture one that had the images loading correctly, and it was only after I started archiving with the broken page's link directly within the app. To do this for all of the broken pages would be quite time consuming...

In summary, only some pages have all of their images working in replays while others don't, and the only way to get them to work is to restart archiving with a direct link to the broken page(s). The expected behavior is that the images would be added to the archive the first time.

I would share the archives I was experiencing the issue with, but the pages require login credentials to access so I would rather not risk it.

Step-by-step reproduction instructions

  1. Use the Windows version of ArchiveWeb.page.
  2. Begin archiving pages from a Canvas course.
  3. After following links between multiple pages, replay pages with embedded images. Some embedded images may appear as broken links even though they are fine in the browser. Images may work initially but break with subsequent replays or viewing of the archive. Exporting as WACZ or otherwise and viewing on ReplayWeb.page apps does not fix it.
  4. Leaving the session and beginning a new archiving session with the link to the page with broken images may fix broken images.

Additional details

No response

@crocodisle crocodisle added the bug Something isn't working label Dec 31, 2024
@ikreymer
Member

Do you have a WACZ file you can share? Otherwise, it's very hard for us to repro.
Some sites load different images at different resolutions and dynamically swap them out (not using img srcset), so this could be an example of that.

@crocodisle
Author

crocodisle commented Dec 31, 2024

@ikreymer Would changing the size of the window affect whether media loads in or not, if this is the case? I tried with a broken page and it didn't change anything. I found that ArchiveWeb.page is saving the images, but they're just not displaying.

Working Live Page:
image

Broken Archived Page:
image
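
One way to double-check the "saved but not displaying" observation without sharing the archive is to list the image entries recorded in the exported WACZ's index. The sketch below is only an illustration, not part of ArchiveWeb.page: it assumes the WACZ is a ZIP file whose `indexes/` directory holds a CDXJ index (`index.cdx` or `index.cdx.gz`) with `url`, `mime`, and `status` fields in each line's JSON part, and the function names are made up for this example.

```python
import gzip
import json
import zipfile


def iter_cdxj_records(wacz_file):
    """Yield the JSON part of each CDXJ line found under indexes/ in a WACZ.

    Assumption: index lines look like "<sort key> <timestamp> <json>".
    """
    with zipfile.ZipFile(wacz_file) as zf:
        for name in zf.namelist():
            if not name.startswith("indexes/"):
                continue
            if not name.endswith((".cdx", ".cdxj", ".cdx.gz")):
                continue
            raw = zf.read(name)
            if name.endswith(".gz"):
                raw = gzip.decompress(raw)
            for line in raw.decode("utf-8").splitlines():
                parts = line.split(" ", 2)
                if len(parts) != 3:
                    continue  # not a CDXJ line
                try:
                    yield json.loads(parts[2])
                except json.JSONDecodeError:
                    continue  # skip lines whose third field is not JSON


def captured_images(wacz_file):
    """Return sorted (url, status) pairs for entries with an image/* mime type."""
    return sorted(
        (rec.get("url", ""), str(rec.get("status", "")))
        for rec in iter_cdxj_records(wacz_file)
        if str(rec.get("mime", "")).startswith("image/")
    )
```

Calling `captured_images("course.wacz")` (a hypothetical filename) and comparing the result with the image URLs on the live page would show whether a broken image's URL appears in the index at all. An index entry only means a response was recorded, so it cannot by itself explain why replay fails to show the image.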
