Images missing from medium.com articles #299

Open
designakt opened this issue Jul 14, 2016 · 13 comments

Comments

@designakt

Any images shown in medium.com articles are lost when read via Reader View.
examples:
https://medium.com/firefox-ux/this-is-our-reading-list-c81c4238dd3d
https://medium.com/the-year-of-the-looking-glass/quality-is-not-a-tradeoff-bcddf7c85553
https://medium.com/@joulee/metrics-versus-experience-a9347d6b80b

[image: readerviewimagesmedium]

@tigt
Contributor

tigt commented Jul 28, 2016

Medium does this awful lazy-loading strategy, where their image markup looks like this (whitespace added for readability):

<figure class="graf--figure graf--leading">
  <div class="aspectRatioPlaceholder is-locked" style="max-width: 699px; max-height: 392px;">
    <div class="aspectRatioPlaceholder-fill" style="padding-bottom: 56.10000000000001%;"></div>
    <div class="progressiveMedia js-progressiveMedia graf-image" data-image-id="0*9lG_11ELAmuIaO-Z.jpeg" data-width="699" data-height="392">
      <img src="https://cdn-images-1.medium.com/freeze/max/30/0*9lG_11ELAmuIaO-Z.jpeg?q=20" crossorigin="anonymous" class="progressiveMedia-thumbnail js-progressiveMedia-thumbnail">
      <canvas class="progressiveMedia-canvas js-progressiveMedia-canvas"></canvas>
      <img class="progressiveMedia-image js-progressiveMedia-image" data-src="https://cdn-images-1.medium.com/max/800/0*9lG_11ELAmuIaO-Z.jpeg">
      <noscript class="js-progressiveMedia-inner">
        <img class="progressiveMedia-noscript js-progressiveMedia-inner" src="https://cdn-images-1.medium.com/max/800/0*9lG_11ELAmuIaO-Z.jpeg">
      </noscript>
    </div>
  </div>
  <figcaption class="imageCaption">From the movie “Jiro Dreams of Sushi”</figcaption>
</figure>

Since the only real <img src> values in there either point at a tiny thumbnail (the /max/30/ URL with ?q=20) or are wrapped in a <noscript>, this might take some doing.
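
A minimal sketch of one way to recover the full-size URL from markup like the above when only the raw HTML string is available: collect the src of every img nested inside a noscript. The function name extractNoscriptImageUrls is hypothetical, and regular expressions are only a rough stand-in for a real HTML parser, so this is an illustration and not how Readability itself handles it.

```javascript
// Hypothetical helper: returns the src of every <img> found inside a <noscript>.
// Assumes double-quoted attributes, as in the Medium markup quoted above.
function extractNoscriptImageUrls(html) {
  const urls = [];
  const noscriptRe = /<noscript\b[^>]*>([\s\S]*?)<\/noscript>/gi;
  // Require whitespace before "src" so that "data-src" is not matched.
  const imgSrcRe = /<img\b[^>]*?\ssrc\s*=\s*"([^"]*)"/gi;
  for (const block of html.matchAll(noscriptRe)) {
    for (const img of block[1].matchAll(imgSrcRe)) {
      urls.push(img[1]);
    }
  }
  return urls;
}
```

Only images inside a noscript are returned, so the low-resolution thumbnail that sits outside it is ignored.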

@designakt
Author

I wonder how Firefox on Android solves this.
They show the picture in their Reader View:

[screenshot: screenshot_2016-08-02-15-37-59]

@tigt
Contributor

tigt commented Aug 2, 2016

Just checked; they're sending different HTML for that User-Agent:

<figure class="graf--figure graf--leading">
  <div class="aspectRatioPlaceholder is-locked" style="max-width: 699px; max-height: 392px;">
    <div class="aspectRatioPlaceholder-fill" style="padding-bottom: 56.10000000000001%;"></div>
    <img class="graf-image" data-image-id="0*9lG_11ELAmuIaO-Z.jpeg" data-width="699" data-height="392" src="https://d262ilb51hltx0.cloudfront.net/max/2000/0*9lG_11ELAmuIaO-Z.jpeg">
  </div>
  <figcaption class="imageCaption">From the movie “Jiro Dreams of Sushi”</figcaption>
</figure>

@westlinkin

How to fix this lazy loading issue then? Is there a solution?

@davidar
Contributor

davidar commented Mar 16, 2018

I'm seeing the same issue with lazy loaded images on other websites too, like here:

<figure class="js_marquee-assetfigure align--center">
  <div class="img-wrapper lazy-image ">
    <div class="img-permalink-sub-wrapper" style="padding-bottom: 46.7%">
      <span class="js_lightbox-wrapper lightbox-wrapper">
        <picture>
          <source class="ls-small-media-source" data-srcset="https://i.kinja-img.com/gawker-media/image/upload/s--uXjLQQPG--/c_fit,fl_progressive,q_80,w_636/18iw0t0arljw4png.png" media="--small">
          <source data-srcset="https://i.kinja-img.com/gawker-media/image/upload/s--uXjLQQPG--/c_fit,fl_progressive,q_80,w_636/18iw0t0arljw4png.png" media="--xlarge">
          <source data-srcset="https://i.kinja-img.com/gawker-media/image/upload/s--uXjLQQPG--/c_fit,fl_progressive,q_80,w_636/18iw0t0arljw4png.png">
          <img src="" class="lazyload ls-lazy-image-tag" data-sizes="auto" data-width="736" data-chomp-id="18iw0t0arljw4png" data-format="png">
        </picture>
      </span>
    </div>
  </div>
</figure>

And here:

<figure class=" contains-caption ">
  <img class="article-image with-structured-caption  lazy" src="//assets.atlasobscura.com/assets/blank-11b9c95a68e295dddd0ea924647536578ce285b2c8469a223c01df1ff3166af1.png" alt="Decoded." width="auto" data-kind="article-image" id="article-image-53637" data-src="https://assets.atlasobscura.com/article_images/53637/image.jpg">
  <figcaption class="caption structured-caption noskim">Decoded. <span class="caption-credit">Photo Illustration: Aida Amer (Circular Music Sheet: David Loberg Code)</span></figcaption>
</figure>

Another (image displays but only a thumbnail):

<div class="image">
  <div id="lazy-img-327137920" class="lazy-img">
    <img src="https://assets.bwbx.io/images/users/iqjWHBFdfxIU/iIkcM6wCXudk/v0/60x-1.jpg"
         data-native-src="https://assets.bwbx.io/images/users/iqjWHBFdfxIU/iIkcM6wCXudk/v0/-1x-1.jpg" class="lazy-img__image" data-img-type="photo">
  </div>
</div>
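
The three snippets above share a pattern: the real URL sits in a data-* attribute while src or srcset is empty, a blank placeholder, or a thumbnail. A rough sketch of promoting those attributes follows. The function name promoteLazyAttributes and the attribute lists are assumptions drawn only from these examples, and other sites may use different names.

```javascript
// Attribute names taken from the examples in this comment; not an exhaustive list.
const LAZY_SRC_ATTRS = ["data-src", "data-native-src"];
const LAZY_SRCSET_ATTRS = ["data-srcset"];

// Hypothetical helper: copies the first non-empty lazy attribute onto src / srcset.
// Works on any object exposing getAttribute and setAttribute (DOM elements do).
function promoteLazyAttributes(el) {
  for (const name of LAZY_SRC_ATTRS) {
    const value = el.getAttribute(name);
    if (value) {
      el.setAttribute("src", value);
      break;
    }
  }
  for (const name of LAZY_SRCSET_ATTRS) {
    const value = el.getAttribute(name);
    if (value) {
      el.setAttribute("srcset", value);
      break;
    }
  }
  return el;
}

// With a DOM available, this could be applied before extraction, for example:
//   document.querySelectorAll("img, source").forEach((el) => promoteLazyAttributes(el));
```

Note that this overwrites an existing src, which suits the placeholder and thumbnail cases shown here but may not be right for every page.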

@rejhgadellaa

+1, would like to see this get fixed

@ldenoue

ldenoue commented Aug 6, 2019

It should suffice to add not just srcset but also data-src to the list of attributes to check. For Medium, we can specifically test whether the URL contains /max/30/, replace it with /max/600/, and remove ?q=20 to fetch a higher-resolution image.
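
A minimal sketch of that URL rewrite, assuming the thumbnail URLs always have the /max/30/ segment and a trailing ?q=20 as in the markup quoted earlier in this thread. The function name upgradeMediumThumbnail is hypothetical, and whether Medium's CDN actually serves the rewritten path is not verified here.

```javascript
// Hypothetical helper: rewrites a Medium thumbnail URL to request a larger image.
// URLs without the /max/30/ segment are returned unchanged.
function upgradeMediumThumbnail(url) {
  if (!url.includes("/max/30/")) {
    return url;
  }
  return url.replace("/max/30/", "/max/600/").replace(/\?q=20$/, "");
}
```

Any other path segments in the URL are left untouched; only the size segment and the quality query string change.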

@jprorama

jprorama commented Sep 7, 2019

Hate to be a downer, but I'm seeing the exact opposite of this issue now. I can see the images on medium.com (tds) in Reader View but can't see them on the normal web page. Here's an example link, though it happens regardless of the article:

https://towardsdatascience.com/introduction-to-bayesian-networks-81031eeed94e?gi=36fbc6efa3d

I've tried the usual steps of clearing cookies and caches, with no effect. Oddly, the network traffic in developer tools shows the images do get fetched. I also see the same behavior in private mode. But if I turn on reader mode, I see the images. That's at least a workaround.

Platform Ubuntu 16.04 Firefox 68.0.2.

@Ambient-Impact

@jprorama I always try to disable Firefox's built-in tracking protection if I'm having issues - you could try that. If you have uBlock Origin, it can still block tracking while Firefox's own is disabled.

@henningko

henningko commented Nov 30, 2022

I fetch articles remotely, and Medium returns img elements without src attributes. Readability removes those, and then also removes the source elements that precede them. I fixed this in my fork by altering _unwrapNoscriptImages:

  _unwrapNoscriptImages: function (doc) {
    // Find img elements without a source or any attribute that might contain an image, and remove them.
    // This is done to prevent a placeholder img from being replaced by the img from a noscript in the next step.
    var imgs = Array.from(doc.getElementsByTagName("img"));
    this._forEachNode(imgs, function (img) {
      for (var i = 0; i < img.attributes.length; i++) {
        var attr = img.attributes[i];
        switch (attr.name) {
          case "src":
          case "srcset":
          case "data-src":
          case "data-srcset":
            return;
        }

        if (/\.(jpg|jpeg|png|webp)/i.test(attr.value)) {
          return;
        }
      }
      // if it has a sibling <source> element, keep the image
      var siblingElement =
        img.previousElementSibling || img.nextElementSibling;
      if (siblingElement && siblingElement.tagName === "SOURCE") {
        return;
      }
      img.parentNode.removeChild(img);
    });

@aehlke

aehlke commented Dec 7, 2022

Could you please contribute this in a PR?

@christoph-nepa

Is there any chance we will get this fixed?

@uriesk

uriesk commented Jan 16, 2025

This is still a thing, especially because the img within the picture elements of a downloaded Medium article doesn't have a src and gets cut out.

I decided to just choose the source from the available source elements myself:

async function chooseSourceOfMedia(document, p, typePriority) {
  let chosenSource;
  let chosenType;
  let chosenSize;
  let altText;
  for (const s of p.childNodes) {
    if (s.tagName === 'IMG' && p.tagName === 'PICTURE') {
      altText = s.alt;
      if (s.src) {
        chosenSource = s.src;
        break;
      }
    } else if (s.tagName === 'SOURCE') {
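      // Only consider sources that carry a URL: take the first one, or one whose type ranks earlier in typePriority.
      if ((s.srcset || s.src) && (!chosenSource || typePriority.indexOf(s.type) < typePriority.indexOf(chosenType))) {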
        chosenType = s.type;
        chosenSize = 0;
        const sources = s.src || s.srcset;
        for (const ss of sources.split(',')) {
          let srcstr = ss.trim();
          if (srcstr.includes('.m3u8')) {
            continue
          }
          let size;
          const space = srcstr.indexOf(' ');
          if (space === -1) {
            chosenSource = srcstr;
            break;
          }
          size = parseInt(srcstr.substring(space + 1), 10);
          srcstr = srcstr.substring(0, space);
          if (Number.isNaN(size)) {
            chosenSource = srcstr;
            break;
          }
          if (size >= chosenSize) {
            chosenSize = size;
            chosenSource = srcstr;
          }
        }
      }
    } else if (s.tagName) {
      // Unexpected element child: give up on this container.
      return null;
    }
  }
  if (chosenSource) {
    return {
      src: chosenSource,
      alt: altText,
    }
  }
  return null;
}

/*
  * check for picture elements that have no source set in their img child,
  * or no img child at all, choose a source and replace it with an img
  */
async function chooseSourceOfPictures(document) {
  const typePriority = ['image/png', 'image/jpeg', 'image/webp', 'image/jxl', 'image/avif'];
  for (const p of document.querySelectorAll('picture')) {
    const chosen = await chooseSourceOfMedia(document, p, typePriority);
    if (!chosen) {
      // No usable source was found; leave this picture untouched.
      continue;
    }
    const img = document.createElement('img');
    img.src = chosen.src;
    if (chosen.alt) {
      img.alt = chosen.alt;
    }
    p.parentNode.replaceChild(img, p);
  }
}

This is of course not a general solution, because you might want to pick your preferred source from the multiple available ones differently. It also doesn't preserve the attributes of the picture; it replaces the picture with a new img.
But it is what I use right now, and I thought that maybe someone would find it useful. It also fixed latimes and other websites for me.
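
One caveat when splitting a srcset on every comma: the Kinja URLs quoted earlier in this thread contain commas inside the URL itself (c_fit,fl_progressive,q_80,w_636), so a plain split(',') would cut them apart. Below is a rough sketch of a more tolerant parser, loosely modeled on the idea that a comma only separates candidates when it ends a URL token or follows a descriptor. The names parseSrcset and pickLargestCandidate are hypothetical, and the sketch assumes a srcset does not mix w and x descriptors.

```javascript
// Hypothetical helper: splits a srcset string into { url, size } candidates.
// size is the numeric part of the descriptor (e.g. 800 for "800w"), or 0 if absent.
function parseSrcset(srcset) {
  const candidates = [];
  let rest = srcset.trim();
  while (rest.length > 0) {
    // A URL is a run of non-whitespace characters; commas inside it are kept.
    let url = /^\S+/.exec(rest)[0];
    rest = rest.slice(url.length);
    let descriptor = "";
    if (url.endsWith(",")) {
      // A trailing comma ends the candidate and means there is no descriptor.
      url = url.replace(/,+$/, "");
    } else {
      const comma = rest.indexOf(",");
      descriptor = (comma === -1 ? rest : rest.slice(0, comma)).trim();
      rest = comma === -1 ? "" : rest.slice(comma + 1);
    }
    rest = rest.replace(/^[\s,]+/, "");
    if (url) {
      candidates.push({ url, size: parseFloat(descriptor) || 0 });
    }
  }
  return candidates;
}

// Returns the URL with the largest descriptor, or null for an empty srcset.
// On ties, the earliest candidate wins.
function pickLargestCandidate(srcset) {
  let best = null;
  for (const candidate of parseSrcset(srcset)) {
    if (best === null || candidate.size > best.size) {
      best = candidate;
    }
  }
  return best ? best.url : null;
}
```

For example, pickLargestCandidate("a.jpg 300w, b.jpg 800w") selects b.jpg, and a single Kinja-style URL with embedded commas comes back whole.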
