[Design Doc] In Doc Search UI #5707

Merged: 21 commits, Jun 12, 2019
1 change: 1 addition & 0 deletions .gitattributes
@@ -13,3 +13,4 @@
*.whl binary
*.woff binary
*.woff2 binary
*.gif binary
175 changes: 175 additions & 0 deletions docs/design/in-doc-search-ui.rst
@@ -0,0 +1,175 @@
In Doc Search UI
================

Giving users the ability to easily find the information
they are looking for is important to us.
We have already upgraded to the latest version of `Elasticsearch`_ and
we plan to implement a `search as you type` feature for all the documentation hosted by us.
It will be designed to provide instant results as soon as the user starts
typing in the search bar, with a clean and minimal frontend.
This design document aims to provide the details of it.
This is a GSoC'19 project.
The final result may look something like this:

.. figure:: ../_static/images/design-docs/in-doc-search-ui/in-doc-search-ui-demo.gif
    :align: center
    :target: ../_static/images/design-docs/in-doc-search-ui/in-doc-search-ui-demo.gif

Member:
I think if you put just in-doc-search-ui-demo.gif in both places here, Sphinx will just do the magic. Not sure, though

Member Author:
I just tried it. If we do this, it looks for the image in the same folder, i.e., design/in-doc-search-ui.gif.


    Short demo


Goals And Non-Goals
-------------------

Project Goals
++++++++++++++

* Support a search-as-you-type/autocomplete interface.
* Support across all (or virtually all) Sphinx themes.
* Support for the JavaScript user experience down to IE11 or graceful degradation where we can't support it.
* Project maintainers should have a way to opt-in/opt-out of this feature.
* (Optional) They should have the flexibility to change some of the styles using custom CSS and JS files.

Non-Goals
++++++++++

* For the initial release, we are targeting only Sphinx documentation,
  as we don't index MkDocs documentation in our Elasticsearch index.

Member:
Of note, supporting all Sphinx themes will also make it much easier for us to support MkDocs. A large part of the issue was around the UI/UX and theme integration, so we should hopefully be able to work around that now :)



Existing Search Implementation
------------------------------

We have detailed documentation explaining the underlying architecture of our search backend
and how we index documents into our Elasticsearch index.
You can read about it :doc:`here <../development/search>`.


Proposed Architecture for In-Doc Search UI
------------------------------------------

Frontend
++++++++

Technologies
~~~~~~~~~~~~

The frontend is to be designed in a theme-agnostic way. To that end,
we explored various libraries that might be of use, but none of them fit our needs.
So, we will likely use vanilla JavaScript for this purpose.
This gives us some advantages over using a third-party library:

* Better control over the DOM.
* Performance benefits.


Proposed Architecture
~~~~~~~~~~~~~~~~~~~~~

We plan to select the search bar, which is present in every project's documentation,
using the `querySelector()`_ method of JavaScript.
We then add an event listener to it to listen for changes and
fire a search query to our backend as soon as there is any change.
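
The selection and listener steps could be sketched roughly as follows. This is only an illustration under assumptions: the selector ``input[name="q"]``, the helper name ``attachSearchListener`` and the ``onQuery`` callback are hypothetical, and the real selector may differ between themes.

```javascript
// Assumption: the theme's search field is an <input name="q">.
const SEARCH_INPUT_SELECTOR = 'input[name="q"]';

// Attach a listener that reports every non-empty query typed by the user.
// `input` is any object exposing addEventListener (normally a DOM element).
function attachSearchListener(input, onQuery) {
  // The "input" event fires whenever the field's value changes.
  input.addEventListener("input", function (event) {
    const query = event.target.value.trim();
    if (query.length > 0) {
      onQuery(query);
    }
  });
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof document !== "undefined") {
  const searchInput = document.querySelector(SEARCH_INPUT_SELECTOR);
  if (searchInput !== null) {
    attachSearchListener(searchInput, function (query) {
      // Placeholder: a real implementation would call the search backend here.
      console.log("would query the backend for:", query);
    });
  }
}
```

Keeping the listener in its own function, separate from the ``querySelector()`` call, makes it possible to exercise the logic without a browser.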

Contributor:
What I'm about to say doesn't need to be added to this document but may be helpful to you more generally. Sometimes events like this are "debounced". In this instance, that means that the search query will only be fired every X number of milliseconds and only when the user stops typing. While you could use vanilla JS for this, there is a handy version in underscore already and underscore is already included by Sphinx (under the variable $u instead of the usual).

Member Author:
Thanks @davidfischer.
But what will be the benefit of debouncing the event?
Will it not make our suggestions' loading time look slower?
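
For reference, the debouncing described in the review comment above could be sketched in vanilla JavaScript as below. This is a minimal illustration, not the underscore implementation: the ``debounce`` helper, its optional ``timers`` argument and the wait value are assumptions made for the example. The trade-off is a short delay after the last keystroke in exchange for fewer requests to the backend.

```javascript
// Return a wrapper that calls `fn` only after `waitMs` milliseconds
// have passed without another call to the wrapper.
// `timers` is optional and exists only so the sketch can be run
// without real delays; by default the global timer functions are used.
function debounce(fn, waitMs, timers) {
  const set = (timers && timers.set) || setTimeout;
  const clear = (timers && timers.clear) || clearTimeout;
  let pending = null;

  return function () {
    const self = this;
    const args = arguments;
    // A new call cancels the previously scheduled one.
    if (pending !== null) {
      clear(pending);
    }
    pending = set(function () {
      pending = null;
      fn.apply(self, args);
    }, waitMs);
  };
}
```

A listener could then be wrapped as ``debounce(sendQuery, 200)``, where ``sendQuery`` and the 200 ms wait are hypothetical.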

Our backend will then return the suggestions,
which will be shown to the user in a clean and minimal UI.
We will be using the `document.createElement()`_ and `node.removeChild()`_ methods
provided by the DOM API, since we don't want empty `<div>` elements left hanging in the DOM.
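
One way to create and clean up the suggestion elements is sketched below. It is only a sketch under assumptions: the function names and the ``search-suggestion`` class are hypothetical, and the document object is passed in as ``doc`` so the logic does not depend on a global ``document``.

```javascript
// Replace the contents of `container` with one <div> per suggestion title.
function renderSuggestions(doc, container, titles) {
  // Remove whatever the previous query rendered.
  while (container.firstChild) {
    container.removeChild(container.firstChild);
  }
  titles.forEach(function (title) {
    const item = doc.createElement("div");
    item.className = "search-suggestion"; // hypothetical class name
    item.textContent = title;
    container.appendChild(item);
  });
}

// Detach the whole suggestion box so no empty <div> stays in the DOM.
function removeSuggestionBox(container) {
  if (container.parentNode) {
    container.parentNode.removeChild(container);
  }
}
```

In a browser, ``doc`` would simply be ``document``.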

We have a few ways to include the required JavaScript and CSS files in all the projects:

* Add CSS into `readthedocs-doc-embed.css` and JS into `readthedocs-doc-embed.js`
  and it will get included.
* Package the in-doc search into its own self-contained CSS and JS files
  and include them in a similar manner to `readthedocs-doc-embed.*`.
* It might be possible to package up the in-doc CSS/JS as a Sphinx extension.

Member:
If this would be a voting, I'd vote for this one :)

Member Author:
I would also vote for that one. We are currently in this direction only.

Member:
My only concern with the extension is that it doesn't work locally. You can build your docs locally, but even if we hit the prod API, it will only return results from prod docs. This feels like a weird thing to do with an extension when it's inherently linked to our RTD production builds.

Member Author:
@ericholscher
We have a global object READTHEDOCS_DATA and it contains api_host. In the extension, we are taking the API host from there.
So in local, it is - http://127.0.0.1:8000
And we can use the extension locally (given that the Elasticsearch is correctly set up).

Member:
Sure, but that requires a running instance of RTD -- I mean it isn't useful outside of RTD. If you already have a local RTD instance, then bundling it with RTD would have the same outcome :)

Member Author:
Yes... that is the limitation. 😕

Member:
Split of code is a win to me. Easier to contribute, easier to focus when debugging, easier to keep up to date, etc. The downside is deploying a bugfix immediately, where core & extension need to be deployed together.

I understand that could feel weird that the results come from production while you are seeing a different set of docs. The extension could disable itself if detecting that it's running outside RTD if we can avoid that weird UX. Not a solution, though.

Member Author:
I think it will increase complexity in development environment..!!

This might be nice because then it's easy to enable it on a per-project basis.
When we are ready to roll it out to a wider audience,
we can decide to just turn it on for everybody (by putting it `here`_)
or we could enable it as an opt-in feature like the `404 extension`_.


UI/UX
~~~~~

There are two ways to show suggestions to the user:

* Show suggestions below the search bar.
* Open a full page search interface when the user clicks on the search field.


Backend
+++++++

We have a few options to support the `search as you type` feature,
but we need to decide which option would be best for our use case.

Member:
Does using one backend or the other involve a lot of changes, or is it just changing a setting and a class or similar? I want to know if this is something that we can test to see how it performs and decide after that, or if testing this way is complicated since they are two different implementations.

Member Author (@dojutsu-user, Jun 6, 2019):
If we want to have results like comment #1, then we have to make some changes in the backend and add a new API endpoint. The changes may increase the size of the Elasticsearch index to 3x or 5x. (In the demo, I made changes to my local Elasticsearch index to produce the results.)
Further changes are possible, for example if we want to support section linking. Currently we just open the result page and the user has to scroll down to the required section. That would need considerably bigger changes: many parts of the Elasticsearch setup would change, including how we index and search data.
I don't think that we can test this without any changes.

Going without any changes is okay too. The test builds at https://readthedocs.org/projects/test-builds-dojutsu-user/versions/ are without any changes.


Edge NGram Tokenizer
~~~~~~~~~~~~~~~~~~~~

* Pros

  * More effective than Completion Suggester when it comes to autocompleting
    words that can appear in any order.
  * It is considerably fast because most of the work is done at index time,
    hence the time taken for autocompletion is reduced.
  * Supports highlighting of the matching terms.

* Cons

  * Requires greater disk space.
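
To illustrate why the work moves to index time and why more disk space is needed, the sketch below mimics what an edge n-gram tokenizer produces: every word is stored as a series of prefixes. The helper functions are illustrative only, and the analysis settings use hypothetical names (``autocomplete_tokenizer``, ``autocomplete``) and gram sizes that are assumptions, not values chosen for this project.

```javascript
// Prefixes of `term` whose length is between minGram and maxGram.
function edgeNgrams(term, minGram, maxGram) {
  const grams = [];
  const upper = Math.min(maxGram, term.length);
  for (let len = minGram; len <= upper; len += 1) {
    grams.push(term.slice(0, len));
  }
  return grams;
}

// Lowercase the text, split it into words, and collect each word's prefixes.
function indexTerms(text, minGram, maxGram) {
  const terms = [];
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  words.forEach(function (word) {
    edgeNgrams(word, minGram, maxGram).forEach(function (gram) {
      terms.push(gram);
    });
  });
  return terms;
}

// edgeNgrams("hello", 1, 10) -> ["h", "he", "hel", "hell", "hello"]
// indexTerms("World Hello", 2, 3) -> ["wo", "wor", "he", "hel"]

// Sketch of Elasticsearch analysis settings using the edge_ngram tokenizer.
// All names and gram sizes here are assumptions for illustration.
const AUTOCOMPLETE_ANALYSIS = {
  analysis: {
    tokenizer: {
      autocomplete_tokenizer: {
        type: "edge_ngram",
        min_gram: 2,
        max_gram: 10,
        token_chars: ["letter", "digit"]
      }
    },
    analyzer: {
      autocomplete: {
        type: "custom",
        tokenizer: "autocomplete_tokenizer",
        filter: ["lowercase"]
      }
    }
  }
};
```

Because every word contributes its own prefixes, "hel" matches "World Hello" as well as "Hello, World", at the cost of storing several terms per word.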


Completion Suggester
~~~~~~~~~~~~~~~~~~~~

* Pros

  * Really fast as it is optimized for speed.
  * Does not require large disk space.

* Cons

  * Matching always starts at the beginning of the text. So, for example,
    "Hel" will match "Hello, World" but not "World Hello".
  * Highlighting of the matching words is not supported.
  * According to the official docs for Completion Suggester,
    the data structures that enable fast lookups are costly to build and are stored in memory.
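
For comparison, a request body for the completion suggester has roughly the shape sketched below. The suggestion name ``page-suggest`` and the field name passed in are hypothetical, and the field would need to be mapped with the ``completion`` type. The small ``matchesFromStart`` helper only illustrates the prefix-matching limitation listed above; it is not how Elasticsearch implements it.

```javascript
// Build the body of a search request that uses the completion suggester.
// "page-suggest" is a hypothetical label for the suggestion block.
function buildCompletionRequest(prefix, field) {
  return {
    suggest: {
      "page-suggest": {
        prefix: prefix,
        completion: { field: field }
      }
    }
  };
}

// Illustration of the limitation: the typed text must match
// the beginning of the stored input.
function matchesFromStart(storedInput, typed) {
  return storedInput.toLowerCase().indexOf(typed.toLowerCase()) === 0;
}

// matchesFromStart("Hello, World", "Hel") -> true
// matchesFromStart("World Hello", "Hel")  -> false
```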


Milestones
----------

+-----------------------------------------------------------------------------------+------------------+
| Milestone                                                                         | Due Date         |
+===================================================================================+==================+
| A local implementation of the project.                                            | 12th June, 2019  |
+-----------------------------------------------------------------------------------+------------------+
| In-doc search on a test project hosted on Read the Docs using the RTD Search API. | 20th June, 2019  |
+-----------------------------------------------------------------------------------+------------------+
| In-doc search on docs.readthedocs.io.                                             | 20th June, 2019  |
+-----------------------------------------------------------------------------------+------------------+
| Friendly user trial where users can add this on their own docs.                   | 5th July, 2019   |
+-----------------------------------------------------------------------------------+------------------+
| Additional UX testing on the top-10 Sphinx themes.                                | 15th July, 2019  |
+-----------------------------------------------------------------------------------+------------------+
| Finalize the UI.                                                                  | 25th July, 2019  |
+-----------------------------------------------------------------------------------+------------------+
| Improve the search backend for efficient and fast search results.                 | 10th August, 2019|
+-----------------------------------------------------------------------------------+------------------+


Open Questions
++++++++++++++

* Should we rely on jQuery, another third-party library, or pure vanilla JavaScript?
* Should subprojects be searched?

Member:
Yes, but the backend API should already be doing this.

Member Author:
Yes. It is a question when we are changing/updating the backend.

* Is our existing Search API sufficient?
* Should we go for edge ngrams or completion suggester?


.. _Elasticsearch: https://www.elastic.co/products/elasticsearch
.. _querySelector(): https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelector
.. _document.createElement(): https://developer.mozilla.org/en-US/docs/Web/API/Document/createElement
.. _node.removeChild(): https://developer.mozilla.org/en-US/docs/Web/API/Node/removeChild
.. _here: https://github.com/rtfd/readthedocs.org/blob/9ca5858e859dea0759d913e8db70a623d62d6a16/readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl#L135-L142
.. _404 extension: https://github.com/rtfd/sphinx-notfound-page