FATAL: MATCH('ZONE:(title) #2535

Closed
5 tasks done
ChudNiki opened this issue Sep 1, 2024 · 7 comments
Labels: bug, rel::upcoming (Upcoming release)


ChudNiki commented Sep 1, 2024

Bug Description:

Manticore crashes with a fatal error when searching with ZONE:(title) or ZONESPAN in MATCH.

The index contains about 10,000,000 records.
Occasionally, 1-2 times out of 10, the queries succeed.

Without ZONE:(title) there is no problem.

Below are the index config and the log messages written at the time of the crash.

# sudo -u manticore indexer homepages --rotate
# sudo -u manticore indexer delta_homepages --rotate

source homepages {
    type = mysql
        sql_host                = localhost
        sql_db                  = 
        sql_user                = 
        sql_pass                = 


        sql_query_pre                   = SET NAMES utf8mb4
        sql_query_pre                   = SET SESSION query_cache_type=OFF

        sql_query_range                 = SELECT MIN(id), MAX(id) FROM `homepages`
        sql_range_step                  = 200

    sql_query = SELECT `id` id, `domain`, homepage FROM `homepages` WHERE id>=$start AND id<=$end
}
source delta_homepages : homepages {
    sql_query_range                 = SELECT MIN(homepages.id), MAX(homepages.id) \
        FROM `homepages` INNER JOIN `domains` ON `homepages`.`domain` = `domains`.`idna` \
        WHERE `domains`.`reg_date` > DATE_SUB(CURDATE(), INTERVAL 40 DAY) AND age_date = reg_date AND update_date > DATE_SUB(CURDATE(), INTERVAL 40 DAY)

    sql_range_step                  = 500

    sql_query = SELECT homepages.id id, homepages.domain domain, homepage FROM `homepages`  INNER JOIN `domains` ON `homepages`.`domain` = `domains`.`idna` \
        WHERE homepages.id>=$start AND homepages.id<=$end \
          AND update_date > DATE_SUB(CURDATE(), INTERVAL 40 DAY) \
           AND `domains`.`reg_date` > DATE_SUB(CURDATE(), INTERVAL 40 DAY) AND age_date = reg_date
}
index homepages {
    type = plain

    stored_fields = domain
    docstore_compression = lz4hc

    morphology = stem_enru
    min_stemming_len = 4

    min_word_len        = 2
    min_prefix_len = 3

    path                = /var/lib/manticore/sites/homepages
    source = homepages
    html_strip = 1
    index_sp = 1
    index_zones = h*, th, title

    html_index_attrs = img=alt,title,src; a=title,href;
    html_remove_elements = style, script
}
index delta_homepages : homepages {
    source = delta_homepages
    path                = /var/lib/manticore/sites/delta_homepages
}
index full_homepages {
    type    = distributed
    local    = homepages
    local    = delta_homepages
}


indexer
{
    mem_limit = 2047M
    write_buffer = 1024M
#    max_packet_size = 4M
    max_iops = 5000
#    write_buffer = 4M

    lemmatizer_cache = 256M
}

searchd {
    listen = 0.0.0.0:9306:mysql
    log = /var/log/manticore/searchd.log
    pid_file = /var/run/manticore/searchd.pid
    expansion_limit = 32

    max_connections = 1024
    threads = 20
    thread_stack = 128M

    network_timeout = 915s

    optimize_cutoff=1
    auto_optimize = 0 # disable automatic OPTIMIZE

    telemetry = 0
    binlog_path = # disable logging
}
-------------------
/var/log/manticore/searchd.log

------- FATAL: CRASH DUMP -------
[Sun Sep  1 21:32:54.430 2024] [797947]

--- crashed SphinxQL request dump ---
SELECT count(*) as c FROM full_homepages WHERE MATCH('ZONE:(title) холодильников рейтинг');
--- request dump end ---
--- local index:homepages
Manticore 6.3.6 593045790@24080214 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with Clang 16.0.6
Configured with flags: Configured with these definitions: -DDISTR_BUILD=bullseye -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DDL_ZSTD=1 -DZSTD_LIB=libzstd.so.1 -DWITH_CURL=1 -DDL_CURL=1 -DCURL_LIB=libcurl.so.4 -DWITH_ODBC=1 -DDL_ODBC=1 -DODBC_LIB=libodbc.so.2 -DWITH_EXPAT=1 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DWITH_ICONV=1 -DWITH_MYSQL=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmariadb.so.3 -DWITH_POSTGRESQL=1 -DDL_POSTGRESQL=1 -DPOSTGRESQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/lib/manticore -DFULL_SHARE_DIR=/usr/share/manticore
Built on Linux x86_64 (bullseye) (cross-compiled)
Stack bottom = 0x7fcd19759500, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x20000)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x20000, stack=0x7fcd19760000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0x227)[0x55ee5bd38b57]
/usr/bin/searchd(_ZN11CrashLogger11HandleCrashEi+0x364)[0x55ee5bbae424]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13140)[0x7fe62e834140]
/usr/bin/searchd(_ZN11ExtRanker_c8IsInZoneEiPK8ExtHit_tPi+0x840)[0x55ee5bde2520]
/usr/bin/searchd(_ZN16ExtConditional_TIL15TermPosFilter_e5E9ExtTerm_TILb1ELb1ELb0EEE12GetDocsChunkEv+0x1a2)[0x55ee5bed1672]
/usr/bin/searchd(_ZN8ExtAnd_c12GetDocsChunkEv+0x62)[0x55ee5be85f02]
/usr/bin/searchd(_ZN11ExtRanker_TILb1EE15GetFilteredDocsEv+0x53)[0x55ee5bde7873]
/usr/bin/searchd(_ZN17ExtRanker_State_TI24RankerState_Proximity_fnILb1ELb0EELb1EE10GetMatchesEv+0x162)[0x55ee5bde8d52]
/usr/bin/searchd(_ZNK13CSphIndex_VLN13MatchExtendedILb0ELb0ELb0ELb0ELb0ELb0ELb0EEEvR16CSphQueryContextRK9CSphQueryRK11VecTraits_TIP15ISphMatchSorterEP10ISphRankeriii+0x5b)[0x55ee5bcf4ccb]
/usr/bin/searchd(_ZNK13CSphIndex_VLN16ParsedMultiQueryERK9CSphQueryR15CSphQueryResultRK11VecTraits_TIP15ISphMatchSorterERK9XQQuery_t17CSphRefcountedPtrI8CSphDictERK18CSphMultiQueryArgsP18CSphQueryNodeCachel+0xe36)[0x55ee5bc66576]
/usr/bin/searchd(+0xfc39c1)[0x55ee5bcde9c1]
/usr/bin/searchd(_ZNK13CSphIndex_VLN19RunParsedMultiQueryEiR17CSphRefcountedPtrI8CSphDictEbRK9CSphQueryR15CSphQueryResultR11VecTraits_TIP15ISphMatchSorterERK9XQQuery_tRK18CSphMultiQueryArgsl+0xfa)[0x55ee5bc6254a]
/usr/bin/searchd(+0xfc6137)[0x55ee5bce1137]
/usr/bin/searchd(+0x1241fc2)[0x55ee5bf5cfc2]
/usr/bin/searchd(_ZZN7Threads11CoRoutine_c13CreateContextESt8functionIFvvEESt4pairIN5boost7context13stack_contextENS_14StackFlavour_EEEENUlNS6_6detail10transfer_tEE_8__invokeESB_+0x1c)[0x55ee5cd54a2c]
/usr/bin/searchd(make_fcontext+0x2f)[0x55ee5cd968bf]
Trying boost backtrace:
 0# sphBacktrace(int, bool) in /usr/bin/searchd
 1# CrashLogger::HandleCrash(int) in /usr/bin/searchd
 2# 0x00007FE62E834140 in /lib/x86_64-linux-gnu/libpthread.so.0
 3# ExtRanker_c::IsInZone(int, ExtHit_t const*, int*) in /usr/bin/searchd
 4# ExtConditional_T<(TermPosFilter_e)5, ExtTerm_T<true, true, false> >::GetDocsChunk() in /usr/bin/searchd
 5# ExtAnd_c::GetDocsChunk() in /usr/bin/searchd
 6# ExtRanker_T<true>::GetFilteredDocs() in /usr/bin/searchd
 7# ExtRanker_State_T<RankerState_Proximity_fn<true, false>, true>::GetMatches() in /usr/bin/searchd
 8# void CSphIndex_VLN::MatchExtended<false, false, false, false, false, false, false>(CSphQueryContext&, CSphQuery const&, VecTraits_T<ISphMatchSorter*> const&, ISphRanker*, int, int, int) const in /usr/bin/searchd
 9# CSphIndex_VLN::ParsedMultiQuery(CSphQuery const&, CSphQueryResult&, VecTraits_T<ISphMatchSorter*> const&, XQQuery_t const&, CSphRefcountedPtr<CSphDict>, CSphMultiQueryArgs const&, CSphQueryNodeCache*, long) const in /usr/bin/searchd
10# 0x000055EE5BCDE9C1 in /usr/bin/searchd
11# CSphIndex_VLN::RunParsedMultiQuery(int, CSphRefcountedPtr<CSphDict>&, bool, CSphQuery const&, CSphQueryResult&, VecTraits_T<ISphMatchSorter*>&, XQQuery_t const&, CSphMultiQueryArgs const&, long) const in /usr/bin/searchd
12# 0x000055EE5BCE1137 in /usr/bin/searchd
13# 0x000055EE5BF5CFC2 in /usr/bin/searchd
14# Threads::CoRoutine_c::CreateContext(std::function<void ()>, std::pair<boost::context::stack_context, Threads::StackFlavour_E>)::{lambda(boost::context::detail::transfer_t)#1}::__invoke(boost::context::detail::transfer_t) in /usr/bin/searchd
15# make_fcontext in /usr/bin/searchd

-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the manual
(https://manual.manticoresearch.com/Reporting_bugs)
Dump with GDB via watchdog
--- active threads ---
thd 0 (work_0), proto mysql, state query, command select
thd 1 (work_2), proto mysql, state query, command select
thd 2 (work_4), proto mysql, state query, command select
thd 3 (work_5), proto mysql, state query, command select
thd 4 (work_6), proto mysql, state query, command select
thd 5 (work_7), proto mysql, state query, command select
thd 6 (work_8), proto mysql, state query, command select
thd 7 (work_9), proto mysql, state query, command select
thd 8 (work_11), proto mysql, state query, command select
thd 9 (work_13), proto mysql, state query, command select
thd 10 (work_15), proto mysql, state query, command select
thd 11 (work_16), proto mysql, state query, command select
thd 12 (work_18), proto mysql, state query, command select
thd 13 (work_19), proto mysql, state query, command select
--- Totally 15 threads, and 14 client-working threads ---
------- CRASH DUMP END -------
[Sun Sep  1 21:33:01.939 2024] [797443] watchdog: main process 797947 crashed via CRASH_EXIT (exit code 2), will be restarted
[Sun Sep  1 21:33:01.940 2024] [797443] watchdog: main process 799019 forked ok
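
When a log contains several crash dumps, the statements that triggered them can be pulled out mechanically for triage. The following is a minimal Python sketch, not a Manticore tool: `extract_crashed_requests` is a hypothetical helper name, it relies only on the `--- crashed ... request dump ---` and `--- request dump end ---` marker lines visible in the log above, and it assumes that a request wrapped over several lines can be rejoined with a single space. A dump without an end marker is skipped.

```python
import re

# Marker lines as they appear in the searchd.log excerpt above.
DUMP_START = re.compile(r"^--- crashed (\w+) request dump ---$")
DUMP_END = "--- request dump end ---"


def extract_crashed_requests(log_text):
    """Return the request text of every complete crash dump in log_text."""
    requests = []
    current = None  # lines of the dump being collected; None outside a dump
    for line in log_text.splitlines():
        stripped = line.strip()
        if current is None:
            if DUMP_START.match(stripped):
                current = []
        elif stripped == DUMP_END:
            # Assumption: wrapped request lines are rejoined with one space.
            requests.append(" ".join(part for part in current if part))
            current = None
        else:
            current.append(stripped)
    return requests
```

To use it, read the contents of /var/log/manticore/searchd.log into a string and pass that string to the function; the result is a list with one entry per crashed request.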
Manticore Search Version:

Manticore 6.3.6 593045790@24080214 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)

Operating System Version:

Debian 11

Have you tried the latest development version?

None

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation reviewed
  • Changelog updated
ChudNiki added the bug label Sep 1, 2024

tomatolog (Contributor) commented

Can you check your homepages index with indextool -c manticore.conf --check homepages to make sure the index is valid and there are no errors in the index data?

ChudNiki commented Sep 2, 2024

> Can you check your homepages index with indextool -c manticore.conf --check homepages to make sure the index is valid and there are no errors in the index data?

I checked it; everything is fine, there are no errors.

I also looked at several other scenarios.
My impression is that the crash depends on the data volume.

The second index is rebuilt every day
and has about 64k records.
This query works on it:
SELECT count(*) FROM delta_homepages WHERE MATCH('@homepage (ZONE:(title) холодильников)');

However, if Manticore has crashed and restarted,
these queries fail for a while and cause repeated crashes.
After waiting about 10 minutes, queries against delta_homepages start working again.

Next, I tried making the query more complex,
and Manticore crashed on the small index.

------- FATAL: CRASH DUMP -------
[Mon Sep 2 11:39:50.916 2024] [861533]

--- crashed SphinxQL request dump ---
SELECT count(*) FROM delta_homepages WHERE MATCH('@homepage (ZONE:(title,h1,h2,h3,
h4) холодильников)')
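
One way to narrow down which zone list triggers the crash is to grow the list one zone at a time and run each resulting query separately. The Python sketch below only builds the query strings in the same shape as the statements in this thread. `zone_match_queries` is a hypothetical helper, nothing here was run as part of this report, and the search term is assumed to be a plain word that needs no escaping.

```python
def zone_match_queries(index, field, zones, term):
    """Build one count(*) query per zone-list prefix: (title), (title,h1), ..."""
    queries = []
    for n in range(1, len(zones) + 1):
        zone_list = ",".join(zones[:n])
        # Assumption: term is a plain word, so no quoting or escaping is applied.
        match = f"@{field} (ZONE:({zone_list}) {term})"
        queries.append(f"SELECT count(*) FROM {index} WHERE MATCH('{match}');")
    return queries
```

Calling it with `"delta_homepages"`, `"homepage"`, and the zones `["title", "h1", "h2", "h3", "h4"]` yields five statements, the first limited to `title` and the last covering all five zones. Because the crash is intermittent, each statement may need to be run several times before drawing a conclusion.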


Index sizes:
delta_homepages - 250 MB (64k records)
homepages - 96 GB (10M records)
All Manticore indexes together: 550 GB.
The server has 150 GB of RAM
and 16 CPU cores.

ChudNiki commented Sep 2, 2024

I cannot guarantee that the HTML is valid.
Many sites on the internet have errors in the HTML code of their pages.

tomatolog (Contributor) commented

Can you upload the index on which the crashes reproduce, as described in the Uploading-your-data section of our manual?
Please include the searchd.log that contains the crash log.

ChudNiki commented Sep 2, 2024

> Can you upload the index on which the crashes reproduce, as described in the Uploading-your-data section of our manual? Please include the searchd.log that contains the crash log.

The upload failed because your storage ran out of space:

mc : Failed to copy /mnt/vdb/manticore/sites/homepages.spd. Storage backend has reached its minimum free drive threshold. Please delete a few objects to proceed.

I uploaded it to Yandex.Disk instead.

There is nothing secret in it, so you can download it from this link:

https://disk.yandex.ru/d/_PX1cgpmXWFtZw

tomatolog (Contributor) commented

I downloaded your index and reproduced the crash. I will post updates here as the fix progresses.

tomatolog self-assigned this Sep 3, 2024

tomatolog (Contributor) commented

The daemon crash on ZONE or ZONESPAN was fixed in 2275ea2.

sanikolaev added the rel::upcoming (Upcoming release) label Sep 12, 2024