Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search this topic does not deliver any results #7870

Open
7wells opened this issue Nov 12, 2023 · 13 comments
Open

Search this topic does not deliver any results #7870

7wells opened this issue Nov 12, 2023 · 13 comments
Assignees
Labels

Comments

@7wells
Copy link

7wells commented Nov 12, 2023

Description

Search This topic does not deliver any results.

Steps to reproduce

  1. Visit a topic.
  2. Select This topic from drop-down menu.
  3. Enter a search term in the search field.
  4. Click SEARCH button.

smf

Environment (complete as necessary)

  • Version/Git revision: 2.1.3
  • Database Type: ??
  • Database Version: ??
  • PHP Version: ??

As I am not the board (https://forum.fhem.de) owner, I cannot provide details about database and PHP, sorry.

Additional information/references

None

@Oldiesmann
Copy link
Contributor

Works fine for me on 2.1.4 at seniorsandfriends.org

@Sesquipedalian
Copy link
Member

Sesquipedalian commented Nov 27, 2023

Since no one else can reproduce the issue and @7wells gave a thumbs up to @Oldiesmann's reply, I am interpreting that to mean that the issue can be closed. If that is incorrect and the issue is still occurring for you, @7wells, please feel free to reply to this with more information about how we can reproduce the problem.

@isaak654
Copy link
Contributor

isaak654 commented Apr 8, 2024

Issue confirmed by @sbulen in https://www.simplemachines.org/community/index.php?msg=4172803

Yep, confirmed. I think this is a bug.

Guests get the not allowed message.

Regular users get a no results message.

Admins get proper results.

@Sesquipedalian Sesquipedalian reopened this Apr 9, 2024
@Oldiesmann
Copy link
Contributor

This affects SMF 3.0 as well.

@Oldiesmann
Copy link
Contributor

Seems to be fixed for 3.0 now with the latest search improvements

@isaak654
Copy link
Contributor

@Sesquipedalian Please consider removing the “Confirmation needed” label and assigning a milestone for 2.1.x.

@sbulen
Copy link
Contributor

sbulen commented Oct 20, 2024

From: https://www.simplemachines.org/community/index.php?msg=4181419

Pretty sure it's not tied to the php or mysql version. My test copy (php 8.3, mysql 8.4; I also tried php 7.4) shows the same behavior as production (php 8.2, mysql 8.0 percona). My forum uses a custom index.

In my forum, it doesn't always fail... But there are patterns...

  • It fails for certain terms in certain topics/boards, for non-admins & guests, in a repeatable fashion.
    • The same user may get proper search results on any other board, for any other search term.
    • The same user may get proper search results using a different search term in the same topic or board.
    • If the same user performs a forum-level search, the results for that topic/board appear to be missing... (!!!)
    • New posts with the same term in the same topic/board do not show up either... (I.e., it's not just migrated data.)
    • Guests are allowed to search, & may get a "You are not allowed to access this section" error first time thru - specifically for the problem query in the problem topic/board. If guests have otherwise passed the required captcha, they may instead just get a no matching search results message.
  • The words are indexed properly, or it would fail for the admin, too. They look fine in the db in the custom search index.
  • Board permissions match all the other public boards - nothing unique about them.
  • Clearing cache makes no difference.

Sure feels like a query/join issue. Something along the lines of one of the posts with that term is in an off-limits board for that user, and it's bubbling up & affecting the query somehow. It's possible this doesn't affect folks using 3rd party search tools, e.g., this forum uses Sphinx, IIRC...

@sbulen
Copy link
Contributor

sbulen commented Oct 20, 2024

So... There are a number of different variants on this in the code, but search basically does a high level word search, stores the complete first-pass result set into log_search_results, then applies any filters if necessary.

I did two queries in a row, one from an admin (that works, id=130), and one from a user (that doesn't, id=131). In the first stage, they both return the same 9 results, so far, so good:
image

The problem occurs in one of the subsequent filtering steps. The admin, of course, is not filtered. But the user result gets filtered based on topic approval state here:

WHERE lsr.id_search = {int:id_search}' . $approve_query . '

Which looks something like this at execution time:

SELECT 3207 AS id_topic, lsr.id_msg, lsr.relevance, lsr.num_matches
FROM smf_log_search_results AS lsr
INNER JOIN smf_topics AS t ON (t.id_topic = lsr.id_topic)
WHERE lsr.id_search = 131
AND (t.approved = 1 OR t.id_member_started = 3087)

Which looks harmless enough... But... Look carefully at the screenshot above... For some reason, for this particular path thru the logic, those aren't topic IDs in the topic ID column, they're message IDs... So this query effectively filters out most/all of the search results...

Still looking into this, but I think this is the culprit right here:

$main_query['select']['id_topic'] = 'm.id_msg AS id_topic';

@sbulen
Copy link
Contributor

sbulen commented Oct 21, 2024

Note 3.0 has the same code:

$main_query['select']['id_topic'] = 'm.id_msg AS id_topic';

@sbulen
Copy link
Contributor

sbulen commented Oct 22, 2024

I suspect the quirky code above (stuffing msg IDs into the topic ID column...) is a bit of a red herring.... Mainly because the comment above that code ("Outrageous!!!") is a clue they knew they were doing something weird... I sure wish there were a better clue in the comments than that, tho...

(Great job on the 3.0 refactor, @Sesquipedalian ... It seems to have ironed out several quirks....)

@sbulen
Copy link
Contributor

sbulen commented Oct 25, 2024

This has been bugging me like a sore tooth... I have started getting complaints on my 2.1 forum... Once they noticed, they noticed...

I have confirmed the problem exists in 3.0, 2.1, and even 2.0. (edited my comments above...)

Here is what I think is going on:

  • The structure for log_search_results has topic ID as part of its primary key.
  • Search displays message-level detail when searching within topics (and also when showing complete results).
  • So... The reason they stuff id_msg into id_topic ("Outrageous!") is to not lose the detail that would otherwise be lost upon INSERT IGNORE...
  • But from that point on, they cannot treat id_topic on log_search_results as an id_topic...
  • But they do, when checking whether the topic has been approved. This only happens when post moderation is active.
    if (!empty($modSettings['postmod_active']))

So, to reproduce in 3.0, you must enable post moderation. For some reason, post moderation is NOT enabled by default in 3.0, but it is enabled by default in 2.1. (This is why I initially thought 3.0 was OK.)

With post moderation ON, any message ID that doesn't point to a valid topic, when joined on id_msg = topic_id, is excluded from search results. Which happens a lot, actually, because you tend to have a lot more messages than topics. It will get much worse on more recent messages & topics on mature forums. Or forums with lots of replies.

The problem happens for any search engine, as this particular logic (post moderation check) is independent of the API.

Admins are not affected as they can see unapproved topics.

@sbulen
Copy link
Contributor

sbulen commented Oct 25, 2024

In 3.x, I think the structure should change. Get rid of the workarounds. Instead of relying on INSERT IGNORE to lop off dupes, fix the GROUP BYs instead. There is so much inscrutable logic in there, with dynamically built queries that are quite difficult to read... But I think the reality is it would simplify things a lot. It is much more convoluted than it needs to be. At its core, these are just approval & permissions checks... No need to get so complicated.

...And don't stuff id_msg into an id_topic column.

This was referenced Oct 25, 2024
@live627
Copy link
Contributor

live627 commented Oct 25, 2024

I'd like to see the end result. The search area is one place I could never really figure out what all was going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants