Make all chats searchable #309
Comments
👍 don't forget this is also one of the reasons many ppl leave slack... they go over the 10k limit, they then need to search back into their search history and they can't ... ;) |
How can we do this with good performance? |
Some ideas, all with limitations and trade-offs: https://www.sqlite.org/fts3.html and https://github.com/olivernn/lunr.js + webworkers. A background doc (explores the problem domain): https://docs.google.com/document/d/1sAk00RsxZHFgyKomKq_n01rHbvsghS8RkQjulBZ3mpI/edit?pli=1#heading=h.p4r6t7cyneha Our problem is compounded by:
|
i have to say I know very little about the domain (very interesting doc, thanks for sharing @Sing-Li! ), but i've been hearing really good things about elasticsearch.. including the fact that it's ridiculously easy to setup and get started with.. might be worth a look? p.s. i was always referring to server-side search |
The problem with Elasticsearch is the application setup and the data duplication. MongoDB has text search per collection; maybe we can use it and keep the setup easy. We also plan to add support for PostgreSQL, which has text search too.
|
hmmm not sure I follow you with regards to 'application setup and data I am not sure if I understand correctly what you mean about using the Yorgos Saslis On 22 August 2015 at 20:51, Rodrigo Nascimento [email protected]
|
@gsaslis Adding Elasticsearch to the application means one more program to set up and keep running, consuming memory and disk (we need to send all searchable data to Elasticsearch to be indexed, so we have data duplication). MongoDB has a text index per collection: http://docs.mongodb.org/v3.0/core/index-text/ With Elasticsearch we can create more powerful searches, probably global searches with one query, but we add complexity to the app setup and to the hardware requirements. IMHO we need both options, but we should start with the simplest one for end users, keeping the ability to clone and just run the application with meteor run while still offering a good search option. Then we can work on an Elasticsearch integration as well as other solutions. |
@rodrigok agreed on extra maintenance work required. also agreed on data duplication. (as long as you don't replace mongodb with elasticsearch that is, of course.) Regardless, as mentioned in the document shared by @Sing-Li, search has 3 important challenges: tokenization, selection and ranking. From what little I've read so far, neither pg nor mongo (attempt to) solve all of them, while ES does.. (main difference in particular is ranking, and ES also offers highlighting). though I appreciate your point about implementing some basic search functionality first, then going into the full-blown solution as a staged approach, I have to say that it feels to me this will lead you down a path of constantly having to 'fix' the 'broken' / non-performant basic search... (i.e. how well will this scale when you have teams with tens of millions of messages? how much time will you need to put into database optimization? performance tuning is a very time consuming task..) A final point to consider is how important the search functionality is for rocket.chat users... This is something I don't really know yet (though I personally feel it is, and Slack has it on their home page as the #3 value proposition) but depending on the answer here, the search feature might be more important than the extra setup cost for the sysadmin... ; ) |
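To make the three challenges named above (tokenization, selection, ranking) concrete, here is a minimal in-memory sketch. It is not how MongoDB, PostgreSQL or Elasticsearch implement search; the class name `TinyIndex`, the regex tokenizer and the TF-IDF-style score are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Tokenization: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory inverted index, for illustration only."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_count = 0

    def add(self, doc_id, text):
        for term, freq in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = freq
        self.doc_count += 1

    def search(self, query):
        terms = tokenize(query)
        if not terms:
            return []
        # Selection: keep only documents that contain every query term.
        candidate_sets = [set(self.postings.get(t, {})) for t in terms]
        selected = set.intersection(*candidate_sets)
        # Ranking: term frequency weighted by inverse document frequency.
        scores = {}
        for doc_id in selected:
            score = 0.0
            for t in terms:
                df = len(self.postings[t])
                score += self.postings[t][doc_id] * math.log(1 + self.doc_count / df)
            scores[doc_id] = score
        # Highest score first; ties broken by document id for stable output.
        return sorted(scores, key=lambda d: (-scores[d], d))
```

Even this toy version shows where the hard parts are: the tokenizer ignores stemming and non-Latin scripts, and the ranking knows nothing about recency or who posted the message.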
If users don't install ES, will we lose the search functionality that we could provide using only the database already in use? IMHO, if users want a better search they can install and integrate ES, but I don't see ES as a requirement for running Rocket.Chat, whereas search is a requirement. I'm missing the opinion of @marceloschmidt, @sampaiodiego and @engelgabriel in this thread. |
so, I agree with @rodrigok that we should not have elasticsearch as a requirement to run rocket.chat. so, I think we need to design a search abstraction layer.. the search API will be the same whether ES is used or not.. ES will be a "driver" or an option of this API.. this way we can start developing the "native" and ES support together, and support any other "search lib" in the future. =) |
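A rough sketch of what such an abstraction layer could look like, assuming a small provider interface and a registry keyed by name. `SearchProvider`, `NativeProvider`, `PROVIDERS` and `create_provider` are hypothetical names, not part of Rocket.Chat, and the "native" driver here is only a substring scan standing in for a database-backed search.

```python
from abc import ABC, abstractmethod


class SearchProvider(ABC):
    """Hypothetical common interface that every search backend would implement."""

    @abstractmethod
    def index(self, message_id, room_id, text):
        """Make one message searchable."""

    @abstractmethod
    def search(self, room_id, query):
        """Return the ids of messages in room_id that match query."""


class NativeProvider(SearchProvider):
    """Stand-in for the 'native' option: a naive scan over stored messages."""

    def __init__(self):
        self._messages = []

    def index(self, message_id, room_id, text):
        self._messages.append((message_id, room_id, text))

    def search(self, room_id, query):
        needle = query.lower()
        return [mid for mid, rid, text in self._messages
                if rid == room_id and needle in text.lower()]


# Registry of available drivers; an ES driver would be added here.
PROVIDERS = {"native": NativeProvider}


def create_provider(name="native"):
    """Instantiate the configured driver, defaulting to the built-in one."""
    if name not in PROVIDERS:
        raise ValueError(f"unknown search provider: {name}")
    return PROVIDERS[name]()
```

With a default driver, nobody has to pick a search backend at install time; switching to another driver later would be a configuration change rather than a different API for callers.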
hmm @sampaiodiego does the abstraction approach really sound suitable for your underlying search infrastructure.. ? I like your general approach, but don't you think that making the sysadmin (who is setting up rocket.chat for his team) have to make a decision on what search capabilities he will need - during the installation process - is a little worse than a clear instruction to install/setup one or the other solution? : ) I really don't mean to be pushing you guys to adopt ES, so please don't take this the wrong way! : ) I'm simply trying to make a point -- and this is very much related to the long-term goal/vision for rocket.chat ... I don't know that myself, so I'm looking for your guidance here on what would be expected from the users (btw, a user is not the person setting up rocket.chat, right? it's the guys and girls using it everyday) .. ; ) Thank you all for your input and consideration!!! |
@sampaiodiego 👍 provider can be mongo, sql, esearch, client-side (client side is important because my new phone has 8 cores and 4g of ram servicing me only ......our server has 4 shared cores and 2g of ram servicing 10,000 users.....and the trend continues) |
With the new Meteor using MongoDB 2.6, you could just use MongoDB's full-text search support. That would be the easiest to implement, and it would just work without any extra external services. |
@rodrigok implemented just what @mitar said. I believe we can give that a good push and see if it is enough. If it is not, we can add ES later. It can definitely be a feature that is added after the initial installation; there is no problem there. I've seen a lot of companies doing just that. All that is needed is an initial script that reads all existing messages to kickstart the index and follows the oplog from there. To test the current implementation, you just need to start typing in the search area in the right tab. It is very basic and only searches the current channel. I'll close this issue and open more specific ones for improvements. |
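The "kickstart the index, then follow the oplog" idea could be sketched roughly as follows. This is a simulation with plain Python lists, not a real MongoDB client: the simplified entry format (`ts`, `op`, `_id`, `text`) and the `Indexer` class are assumptions for illustration; only the op codes "i", "u" and "d" mirror the ones the MongoDB oplog uses for insert, update and delete.

```python
class Indexer:
    """Keeps a search index (here a plain dict) in sync with a message store."""

    def __init__(self):
        self.index = {}   # message id -> text
        self.last_ts = 0  # timestamp of the last oplog entry applied

    def backfill(self, messages, oplog_ts):
        # Remember the log position observed before the scan, then index
        # every existing message.
        self.last_ts = oplog_ts
        for msg in messages:
            self.index[msg["_id"]] = msg["text"]

    def follow(self, oplog):
        # Apply only entries newer than the last one already processed.
        for entry in oplog:
            if entry["ts"] <= self.last_ts:
                continue
            if entry["op"] == "d":
                self.index.pop(entry["_id"], None)
            else:  # "i" (insert) and "u" (update) both overwrite the text
                self.index[entry["_id"]] = entry["text"]
            self.last_ts = entry["ts"]
```

Because every write to the index is an overwrite or a delete by id, replaying an entry twice is harmless, which makes it easier to resume after a crash.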
Anybody willing to raise the bounty? I just added $200: https://www.bountysource.com/issues/28998888-global-search-across-channels |
https://github.com/mfisher35/rc_search I made this search API (it just does a text search against MongoDB and is extremely fast). The only issue is that it is truly global: users may get results back from channels they are not supposed to see. |
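One possible way to address the visibility problem described above is to restrict the query to rooms the user may read, and to filter results again before returning them. This is a sketch, not the rc_search code: the field name `rid` for the room id and both function names are assumptions; `$text`, `$search` and `$in` are standard MongoDB query operators.

```python
def build_search_filter(text, allowed_room_ids):
    """Build a MongoDB-style filter: full-text match limited to permitted rooms.

    The field name "rid" (room id) is an assumption about the message schema.
    """
    return {
        "$text": {"$search": text},
        "rid": {"$in": sorted(allowed_room_ids)},
    }


def filter_results(results, allowed_room_ids):
    """Defensive post-filter for results from a search that ignores permissions."""
    allowed = set(allowed_room_ids)
    return [r for r in results if r["rid"] in allowed]
```

Filtering inside the query is preferable, since the database then never returns messages the user cannot see; the post-filter is only a safety net for backends that cannot express the restriction.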
Maybe we could have this on channel history