Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

favicon fetch improvements #1786

Merged

Conversation

tycho
Copy link
Contributor

@tycho tycho commented Mar 28, 2018

This pull request improves the favicon fetching process.

Description

Here is a summary of the changes:

  • Eliminate dependency on libcurl. We already depend on Qt5Network, so why not use the HTTP fetching from there?

  • Show a progress dialog when downloading the favicon. The main utility of this is giving the user the option to cancel a download attempt (e.g. if it's taking too long). If the user clicks "cancel", it will cancel the current download attempt and try the next fallback URL in the list.

  • Try three different ways to obtain the favicon, in this order:

    • Direct to fully-qualified domain name (e.g. https://foo.bar.example.com/favicon.ico)
    • Direct to 2nd-level domain name (e.g. https://example.com/favicon.ico)
    • Google lookup for 2nd-level domain name (if enabled in settings)

I changed the Google lookup because a match is more likely to be found for the 2nd level domain than for the fully-qualified name. But if Google doesn't find a match, it doesn't actually return an error code. Instead, it returns a generic default icon, which is not really the desired result.

Future work thoughts

I don't approve of Google's behavior when it doesn't find a match. Since the API doesn't provide an error code, we don't have a programmatic way to know whether or not it actually found a result, or whether we need to look elsewhere for an icon. The best we could probably do is hash the icon and recognize that it matches the generic icon. But that's prone to breaking if the generic icon ever changes.

Because of this I'm looking into alternative fallback methods for finding the favicon.

One way would be to fetch and parse the index page for the fully-qualified domain name or second-level domain, and look for <link> elements which refer to icons. For example, here's the relevant elements inside <head> on https://doc.qt.io (line wrapped for readability):

<link rel="shortcut icon"
  href="//d33sqmjvzgs8hq.cloudfront.net/wp-content/themes/oneqt/assets/images/favicon.ico.gzip" />
<link rel="icon" type="image/png"
  href="//d33sqmjvzgs8hq.cloudfront.net/wp-content/themes/oneqt/assets/images/favicon-32x32.png" />
<link rel="icon" type="image/png"
  href="//d33sqmjvzgs8hq.cloudfront.net/wp-content/themes/oneqt/assets/images/favicon-16x16.png" />

We would probably want some strategy around choosing which icon to use. We'd probably want to prefer image/png over image/x-icon. We'd also probably prefer larger image sizes, as Qt will already scale them to fit the UI. Tha traditional favicon size of 16x16 doesn't look good at all on a high DPI display, but quite a few sites provide icons larger than that if you know where to look.

Motivation and context

I was actually looking at reducing the Windows build's install footprint. libcurl's contribution to that footprint isn't exactly trivial. In my x86_64 MinGW build, this PR reduces the install size from 66.3MB to 60.8MB. Here's the delta for the directory structure:

--- keepassxc.files.sort        2018-03-27 17:16:03.091086400 -0700
+++ keepassxc-small.files.sort  2018-03-27 17:16:03.114191300 -0700
@@ -1,8 +1,6 @@
 .
 ./KeePassXC.exe
 ./libbz2-1.dll
-./libcurl-4.dll
-./libeay32.dll
 ./libfreetype-6.dll
 ./libgcc_s_seh-1.dll
 ./libgcrypt-20.dll
@@ -14,17 +12,13 @@
 ./libicudt58.dll
 ./libicuin58.dll
 ./libicuuc58.dll
-./libidn2-0.dll
 ./libintl-8.dll
-./libnghttp2-14.dll
 ./libpcre-1.dll
 ./libpcre2-16-0.dll
 ./libpng16-16.dll
 ./libsodium-18.dll
-./libssh2-1.dll
 ./libssp-0.dll
 ./libstdc++-6.dll
-./libunistring-2.dll
 ./libwinpthread-1.dll
 ./libykpers-1-1.dll
 ./libyubikey-0.dll
@@ -32,8 +26,4 @@
 ./Qt5Gui.dll
 ./Qt5Network.dll
 ./Qt5Widgets.dll
-./ssl
-./ssl/certs
-./ssl/certs/ca-bundle.crt
-./ssleay32.dll
 ./zlib1.dll

As a nice side effect, we get a UI improvement, with the ability to cancel a favicon fetch in progress.

How has this been tested?

Aside from the test suite, I tried fetching favicons for a few different domain names and it worked as expected.

Because of the 2nd-level domain change, this also unbroke a few different favicons for me. For example, I have an entry in my database for secure.wa.aaa.com, but that fully qualified domain doesn't have a favicon, and Google just returns their generic icon for it. The 2nd-level domain aaa.com does have a favicon, and this change is able to fetch that one properly.

Types of changes

  • ✅ Bug fix (non-breaking change which fixes an issue)
  • ✅ New feature (non-breaking change which adds functionality)

Checklist:

  • ✅ I have read the CONTRIBUTING document. [REQUIRED]
  • ✅ My code follows the code style of this project. [REQUIRED]
  • ✅ All new and existing tests passed. [REQUIRED]
  • ✅ I have compiled and verified my code with -DWITH_ASAN=ON. [REQUIRED]

@droidmonkey
Copy link
Member

droidmonkey commented Mar 28, 2018

I specifically did not use QNetworkRequest because 3/4 of the desired functionality is not available prior to Qt 5.9. This will break builds on older Linux distros. Please apply your other fixes (except for progress) using Curl.

@tycho
Copy link
Contributor Author

tycho commented Mar 28, 2018

That's unfortunate. As I said in the PR, the original goal was to eliminate libcurl. The other improvements were secondary as I was in the code anyway.

Per our discussion on IRC: We should probably revisit this PR when Ubuntu Bionic is out. We could have older distros without Qt 5.9 use AppImage bundles instead.

@phoerious
Copy link
Member

We got rid of QNetworkAccessManager in the first place, because it has a horrid bug that completely breaks people's wifi on Windows and macOS.

@tycho
Copy link
Contributor Author

tycho commented Mar 28, 2018

@phoerious Can you point to said bug? How could a userspace application kill their wifi?

@phoerious
Copy link
Member

Issue #306
You can turn off bearer management with an environment variable, but that's a pretty annoying workaround which we cannot really go for.

@tycho
Copy link
Contributor Author

tycho commented Mar 28, 2018

Wow, that's mind-boggling.

What's your opposition to setting the environment variable if it fixes the problem? (Other than it being kind of gross?)

@phoerious
Copy link
Member

We need to set it before the program starts (which would need a wrapper) or at least before we create a QNetworkAccessManager instance (and then make sure it uses the right environment). It's very fiddly and I don't even know how it works on Windows.

tycho added 2 commits March 27, 2018 23:56
QFileDialog was already included, literally two lines before...

Signed-off-by: Steven Noonan <[email protected]>
Numerous improvements:

* Eliminate dependency on libcurl. We already depend on Qt5Network, so
  why not use the HTTP fetching from there?

* Show a progress dialog when downloading the favicon. The main utility
  of this is giving the user the option to cancel a download attempt
  (e.g. if it's taking too long). Canceling will try the next fallback
  URL in the list.

* Try three different ways to obtain the favicon, in this order:

  * Direct to fully-qualified domain name
    (e.g. https://foo.bar.example.com/favicon.ico)
  * Direct to 2nd-level domain name
    (e.g. https://example.com/favicon.ico)
  * Google lookup for 2nd-level domain name (if enabled in settings)

I changed the Google lookup, because a match is more likely to be found
for the 2nd level domain than for the fully-qualified name.

Google's error behavior is strange. If it doesn't find a match, it
doesn't return an error. Instead, it returns a generic default icon,
which is not really the desired result. This also means that unless we
have some way to detect that we've received the generic icon, we can't
fall back to any alternatives.

Signed-off-by: Steven Noonan <[email protected]>
@tycho tycho force-pushed the favicon-fetch-improvements branch from 54b5470 to eefe088 Compare March 28, 2018 06:57
@tycho
Copy link
Contributor Author

tycho commented Mar 28, 2018

How is it fiddly? Either the environment variable is there or it's not. And it's easy enough to just set it in main() before doing anything with Qt:

diff --git a/src/main.cpp b/src/main.cpp
index b3b607f2..2ccbaf25 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -45,6 +45,21 @@ Q_IMPORT_PLUGIN(QXcbIntegrationPlugin)
 #endif
 #endif

+static inline void earlyQNetworkAccessManagerWorkaround()
+{
+    // When QNetworkAccessManager is instantiated it regularly starts polling
+    // all network interfaces to see if anything changes and if so, what. This
+    // creates a latency spike every 10 seconds on Mac OS 10.12+ and Windows 7 >=
+    // when on a wifi connection.
+    // So here we disable it for lack of better measure.
+    // This will also cause this message: QObject::startTimer: Timers cannot
+    // have negative intervals
+    // For more info see:
+    // - https://bugreports.qt.io/browse/QTBUG-40332
+    // - https://bugreports.qt.io/browse/QTBUG-46015
+    qputenv("QT_BEARER_POLL_TIMEOUT", QByteArray::number(-1));
+}
+
 int main(int argc, char** argv)
 {
 #ifdef QT_NO_DEBUG
@@ -52,6 +67,8 @@ int main(int argc, char** argv)
 #endif
     Tools::setupSearchPaths();

+    earlyQNetworkAccessManagerWorkaround();
+
     Application app(argc, argv);
     Application::setApplicationName("keepassxc");
     Application::setApplicationVersion(KEEPASSX_VERSION);

@TheZ3ro
Copy link
Contributor

TheZ3ro commented Mar 28, 2018

We could have older distros without Qt 5.9 use AppImage bundles instead.

I don't think this is a solution. For example Ubuntu LTS (16.04) still has Qt5.5.1

@tycho
Copy link
Contributor Author

tycho commented Mar 28, 2018

For example Ubuntu LTS (16.04) still has Qt5.5.1

I'm not sure I see the problem with that, as 18.04's an LTS release too. It seems to make sense to target the current LTS release as the intended platform. Or is 16.04 not capable of using AppImage bundles for some reason?

@droidmonkey
Copy link
Member

Ubuntu users should be using snaps, which also alleviate this issue entirely since we bring our own Qt version.

@TheZ3ro
Copy link
Contributor

TheZ3ro commented Mar 28, 2018

I don't really like fixing a distribution problem in a platform by stop supporting a different one.
Ubuntu 16.04 LTS will be supported until April 2021 and Qt5.6 LTS until 2019.

If it was for me I will just drop everything and only support the latest version of every package, but we can't really do it.
One thing is ship binaries with Qt5.9 and support since Qt5.3, another is cut support below Qt5.9 and just ship AppImage/snaps/force user to compile Qt5.9.

AppImage after AppImage, in the end, on users' OS there are 5 different distribution of Qt: one for VLC, one for QtCreator, one for KeePassXC, etc etc...

Personally I would go lighter on users, my 2 cent

@phoerious
Copy link
Member

We need to support 14.04 at least until it's EOL and we already require Qt5.3 which is only available from a PPA, which I don't like very much. We would rather go in the opposite direction and backport. Requiring the latest Qt is not a solution and requiring the latest LTS of a Linux distribution even less so, because it's way too new to build AppImages from it.

@tycho
Copy link
Contributor Author

tycho commented Mar 29, 2018 via email

…ollowRedirectsAttribute

We just have to follow the redirect manually in such a case.

Signed-off-by: Steven Noonan <[email protected]>
@tycho
Copy link
Contributor Author

tycho commented Mar 29, 2018

I just added a change to make it support earlier Qt versions by following redirects manually instead of using QNetworkRequest::FollowRedirectsAttribute. I tested it out against one of my own domains with a relative redirect URL from favicon.ico -> favicon1.ico, which it correctly resolved and followed. I also tested it against domains that have absolute URL redirects and domains with no redirects.

It looks like TeamCity is happy with the build now.

@lordraiden
Copy link

Is there a way to recover all the favicons for the websites automatically?

@droidmonkey droidmonkey added this to the v2.3.2 milestone Apr 5, 2018
QNetworkRequest request(url);
request.setAttribute(QNetworkRequest::FollowRedirectsAttribute, true);
request.setAttribute(QNetworkRequest::RedirectPolicyAttribute, QNetworkRequest::NoLessSafeRedirectPolicy);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to use these attributes if the Qt version is > 5.9? Likewise in the code above this only perform the redirect checks if the Qt version is older.

#if QT_VERSION >= QT_VERSION_CHECK(5, 9, 0)
// Do attribute stuff
#endif

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's messier to do the ifdefs. Why not just do it this way until the future when 5.9 is prolific and then move it over to the simpler way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ifdefs are messier, but it only exposes a small population to the workaround code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I get the argument. From what I read, you're advocating for:

  1. having the less readable ifdefs,
  2. having multiple possible code paths that are chosen at compile-time,
  3. making it harder to debug if something goes wrong (which code path got used at compile time? which Qt version is it?)

Why would you want that instead of a single well-tested and readable code path that all builds use?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well when you put it that way!

@droidmonkey
Copy link
Member

@lordraiden that is covered in issue #735

Copy link
Member

@droidmonkey droidmonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code works great!

@droidmonkey droidmonkey merged commit 056bbaa into keepassxreboot:release/2.3.2 Apr 5, 2018
droidmonkey added a commit that referenced this pull request May 8, 2018
- Enable high entropy ASLR on Windows [#1747]
- Enhance favicon fetching [#1786]
- Fix crash on Windows due to autotype [#1691]
- Fix dark tray icon changing all icons [#1680]
- Fix --pw-stdin not using getPassword function [#1686]
- Fix placeholders being resolved in notes [#1907]
- Enable auto-type start delay to be configurable [#1908]
- Browser: Fix native messaging reply size [#1719]
- Browser: Increase maximum buffer size [#1720]
- Browser: Enhance usability and functionality [#1810, #1822, #1830, #1884, #1906]
- SSH Agent: Parse aes-256-cbc/ctr keys [#1682]
- SSH Agent: Enhance usability and functionality [#1677, #1679, #1681, #1787]
jtl999 pushed a commit to jtl999/keepassxc that referenced this pull request Jul 29, 2018
* Eliminate dependency on libcurl in favor of Qt5Network code
* Supports older Qt versions without QNetworkRequest::FollowRedirectsAttribute

* Show a progress dialog when downloading the favicon. The main utility
  of this is giving the user the option to cancel a download attempt
  (e.g. if it's taking too long). Canceling will try the next fallback URL in the list.

* Try three different ways to obtain the favicon, in this order:
  1) Direct to fully-qualified domain (e.g. https://foo.bar.example.com/favicon.ico)
  2) Direct to 2nd-level domain (e.g. https://example.com/favicon.ico)
  3) Google lookup for 2nd-level domain name (if enabled in settings)

I changed the Google lookup, because a match is more likely to be found
for the 2nd level domain than for the fully-qualified name.

Google's error behavior is strange. If it doesn't find a match, it
doesn't return an error. Instead, it returns a generic default icon,
which is not really the desired result. This also means that unless we
have some way to detect that we've received the generic icon, we can't
fall back to any alternatives.

Signed-off-by: Steven Noonan <[email protected]>
kneitinger added a commit to kneitinger/keepassxc that referenced this pull request Sep 19, 2018
kneitinger added a commit to kneitinger/keepassxc that referenced this pull request Sep 20, 2018
droidmonkey pushed a commit to kneitinger/keepassxc that referenced this pull request Sep 20, 2018
droidmonkey pushed a commit that referenced this pull request Sep 21, 2018
…er (#2309)

* Replace Google with DuckDuckGo for optional fallback favicon fetch URL
Modify the work initially done in #36, and most recently modified in #1786,
to use DuckDuckGo's https://icons.duckduckgo.com/ip3/www.example.com.ico
favicon endpoint.

Fixes #2258

* Close failed favicon fetch progress bars

Name the UrlFetchProgressDialog() with the corresponding URL in order to
be identified by name by its parent when the failed request is handeled
in EditWidgetIcons::fetchFinished(). fetchFinished() retrieves the
relevant UrlFetchProgressDialog() and calls close() on it.

Fixes: #2265
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants