Repeating images #2
Comments
This GitHub article is a problem if apple-touch-icon-144.png is excluded. Either the icon is not excluded (but this may ruin other GitHub pages), or image resizing is used to resize https://cloud.githubusercontent.com/assets/22635/3517580/2399aa10-06f1-11e4-8671-0923504c594a.png.
Added these to extralist, but the image it should pick is too big: images.forbes.com/images/channels/headercoverstory*
Forbes images: I have included some regex filters in extralist, but the main image is excluded due to its size. Perhaps consider image resampling/resizing using Pillow?
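To make the Pillow suggestion concrete, here is a minimal sketch of downscaling an oversized image instead of rejecting it. The names `fit_within` and `resize_file` and the 1024x1024 limit are assumptions for illustration; the actual size limit used by the project is not stated in this thread.

```python
def fit_within(size, max_size):
    """Return (width, height) scaled down to fit max_size, keeping aspect ratio.

    Images that already fit are returned unchanged (never upscaled).
    """
    width, height = size
    max_w, max_h = max_size
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_file(src, dst, max_size=(1024, 1024)):
    """Hypothetical helper: write a downscaled copy of src to dst using Pillow."""
    # Imported here so the size calculation above works without Pillow installed.
    from PIL import Image

    with Image.open(src) as img:
        resized = img.resize(fit_within(img.size, max_size), Image.LANCZOS)
        resized.save(dst)
```

With this approach the size filter could run on the resized copy, so a large Forbes header image would no longer be dropped only because of its dimensions.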
a16z.com: the map and office photo are excluded, but the article image is not found as it's embedded in the following: Perhaps a custom extractor can be used here.
Ok so here's one option: Second, we could have custom extraction handlers based on the link URL, as you said. Re image manipulation (resizing etc.), we could try http://thumbor.org/ in the future; I think I mentioned it before. But let's leave this for later.
Looks like a candidate for PhantomJS: googleandyourbusiness.blogspot.com
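The idea of custom extraction handlers keyed on the link URL could look roughly like the sketch below. Everything here is hypothetical: `HANDLERS`, `pick_handler`, `extract_default`, and `extract_a16z` are illustrative names, and the a16z handler is only a placeholder because the markup it would need to parse is not shown in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _FirstImg(HTMLParser):
    """Collects the src of the first <img> tag seen."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src")


def extract_default(html):
    """Fallback handler: return the first <img> src, or None."""
    parser = _FirstImg()
    parser.feed(html)
    return parser.src


def extract_a16z(html):
    """Placeholder for a site-specific handler (real logic not known here)."""
    return extract_default(html)


# Hypothetical registry mapping a host name to its handler.
HANDLERS = {"a16z.com": extract_a16z}


def pick_handler(link_url):
    """Choose a handler from the link's host, falling back to the default."""
    host = urlparse(link_url).hostname or ""
    return HANDLERS.get(host, extract_default)
```

A caller would then run `pick_handler(link_url)(html)`, so adding support for a new site means adding one entry to the registry.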
The following images should be specified in extralist.txt:
fcw.com -> http://fcw.com/design/gig/fcw/2012/img/fcw-logo.png
github.com -> https://github.com/apple-touch-icon-144.png
www.forbes.com -> http://images.forbes.com/images/channels/headercoverstory*
a16z.com -> http://a16z.files.wordpress.com/2014/01/7cb56ea5114a9f0e92d53bf0e171d15d.png
www.gv.com -> http://img.gv.com/wp-content/uploads*
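As an illustration of how entries like these could be checked, here is a small sketch that matches an image URL against the patterns listed for a page's host. It assumes the trailing `*` is a shell-style wildcard, which may differ from how extralist.txt is actually parsed; `EXTRALIST` and `matches_extralist` are hypothetical names, and the sketch takes no position on what the tool does with a match.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Entries copied from the list above, keyed by page host.
EXTRALIST = {
    "fcw.com": ["http://fcw.com/design/gig/fcw/2012/img/fcw-logo.png"],
    "github.com": ["https://github.com/apple-touch-icon-144.png"],
    "www.forbes.com": ["http://images.forbes.com/images/channels/headercoverstory*"],
    "a16z.com": ["http://a16z.files.wordpress.com/2014/01/7cb56ea5114a9f0e92d53bf0e171d15d.png"],
    "www.gv.com": ["http://img.gv.com/wp-content/uploads*"],
}


def matches_extralist(page_url, image_url):
    """Return True if image_url matches any pattern listed for the page's host."""
    host = urlparse(page_url).hostname or ""
    return any(fnmatchcase(image_url, pattern) for pattern in EXTRALIST.get(host, []))
```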