Updating index causes a crash #403

Closed
damirt2020 opened this issue Aug 28, 2020 · 11 comments
@damirt2020

damirt2020 commented Aug 28, 2020

I can now reproduce the crash consistently and have an index file ready for you, cleansed of any customer-sensitive data. If you run this query, searchd crashes right away:

update dx1parts set dealershipid='0' where dealershipid <> '0';

The manticore folder containing the index and manticore.conf has been uploaded to your FTP server, in the github-issue-403 folder.

------- FATAL: CRASH DUMP -------
[Fri Aug 28 22:45:34.448 2020] [24647]

--- crashed SphinxQL request dump ---
update dx1parts set dealershipid='0' where dealershipid <> '0'
--- request dump end ---
--- local index:dx1parts
Manticore 3.5.0 1d34c49@200722 release
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 4.8.5
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDISTR_BUILD=rhel7 -DUSE_SSL=ON -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.18 -DDL_PGSQL=1 -DPGSQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/data -DFULL_SHARE_DIR=/usr/share/manticore -DUSE_RE2=1 -DUSE_ICU=1 -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=1 -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_PGSQL=1 -DWITH_RE2=1 -DWITH_STEMMER=1 -DWITH_ZLIB=ON -DGALERA_SONAME=libgalera_manticore.so.31 -DSYSCONFDIR=/etc/manticoresearch
Host OS is Linux runner-fa6cab46-project-3858465-concurrent-0 4.19.78-coreos #1 SMP Mon Oct 14 22:56:39 -00 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7fd804020b90, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x591b10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x591b10, stack=0x7fd804020000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0x90)[0x723f30]
/usr/bin/searchd(_ZN11CrashLogger11HandleCrashEi+0x1b2)[0x592182]
/lib64/libpthread.so.0(+0xf630)[0x7fd822e81630]
/usr/bin/searchd[0x8cfe20]
/usr/bin/searchd(_ZNK9CSphMatch13FetchAttrDataERK15CSphAttrLocatorPKhRi+0x14)[0x677d24]
/usr/bin/searchd(_ZNK14FilterString_c4EvalERK9CSphMatch+0x25)[0x847d75]
/usr/bin/searchd(_ZNK10Filter_Not4EvalERK9CSphMatch+0xe)[0x84591e]
/usr/bin/searchd(_ZNK13CSphIndex_VLN12ScanByBlocksERK16CSphQueryContextP15CSphQueryResultiPP15ISphMatchSorterR9CSphMatchibbil+0x478)[0x690278]
/usr/bin/searchd(_ZNK13CSphIndex_VLN9MultiScanEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0xaf0)[0x6aac00]
/usr/bin/searchd(_ZNK13CSphIndex_VLN10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x6b8)[0x6ba9a8]
/usr/bin/searchd[0x883fa3]
/usr/bin/searchd[0x9607e7]
/usr/bin/searchd(_ZN7Threads10CoExecuteNEOSt8functionIFvvEEi+0x17f)[0x95c57f]
/usr/bin/searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x9e3)[0x8917e3]
/usr/bin/searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x77)[0x893427]
/usr/bin/searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x62b)[0x5d0f8b]
/usr/bin/searchd(_ZN15SearchHandler_c14RunActionQueryERK9CSphQueryRK10CSphStringPS3_+0x1c8)[0x5dc6f8]
/usr/bin/searchd(_Z25HandleMySqlExtendedUpdateR14AttrUpdateArgs+0x7c)[0x5dcbcc]
/usr/bin/searchd(_ZN15CommitMonitor_c6UpdateER10CSphString+0xb8)[0x617338]
/usr/bin/searchd[0x626452]
/usr/bin/searchd[0x5b771d]
/usr/bin/searchd(_Z20sphHandleMysqlUpdateR19StmtErrorReporter_iRK9SqlStmt_tRK10CSphStringRS4_+0x213)[0x5de703]
/usr/bin/searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR11RowBuffer_i+0x7d6)[0x5fe916]
/usr/bin/searchd[0x66a6d8]
/usr/bin/searchd(_Z8SqlServe11SharedPtr_TIP13SockWrapper_c9Deleter_TIS1_L5ETYPE0EE16ISphRefcountedMTE+0x8e8)[0x66b488]
/usr/bin/searchd[0x666b2a]
/usr/bin/searchd(_ZZN7Threads11CoRoutine_cC1ESt8functionIFvvEEmENUlN5boost7context6detail10transfer_tEE_4_FUNES7_+0x17)[0x95cc07]
/usr/bin/searchd(make_fcontext+0x2f)[0x96218f]
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the documentation
(http://docs.manticoresearch.com/latest/html/reporting_bugs.html)
Dump with GDB via watchdog
[Fri Aug 28 22:45:34.679 2020] [24646] watchdog: got USR1, performing dump of child's stack
Will run gdb on '/usr/bin/searchd', pid '24647'
Error reading attached process's symbol file.
: No such file or directory.
Error reading attached process's symbol file.
: No such file or directory.
[New LWP 26495]
[New LWP 25494]
[New LWP 24652]
[New LWP 24650]
[New LWP 24649]
[New LWP 24648]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
Id Target Id Frame
7 Thread 0x7fd82396e700 (LWP 24648) "work_0" 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
6 Thread 0x7fd82394d700 (LWP 24649) "work_1" 0x00007fd822e7da35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x7fd82392c700 (LWP 24650) "TaskSched" 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7fd8238ca700 (LWP 24652) "TickPool_0" 0x00007fd821d65eb3 in epoll_wait () from /lib64/libc.so.6
3 Thread 0x7fd73f79c700 (LWP 25494) "sphTimer" 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
2 Thread 0x7fd823889700 (LWP 26495) "TaskW_3" 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x7fd823970900 (LWP 24647) "searchd" 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6

Thread 7 (Thread 0x7fd82396e700 (LWP 24648)):
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000002 in ?? ()
#3 0x00000000000a34f3 in ?? ()
#4 0x00000000000000f5 in ?? ()
#5 0x0000000000723fb0 in ?? ()
#6 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7fd82394d700 (LWP 24649)):
#0 0x00007fd822e7da35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000093f9d4 in ?? ()
#2 0x00007fd82394cc5f in ?? ()
#3 0x0000000002b41930 in ?? ()
#4 0x0000000002b41998 in ?? ()
#5 0x0000000000000001 in ?? ()
#6 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7fd82392c700 (LWP 24650)):
#0 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000072a9e6 in ?? ()
#2 0x000000005f498e41 in ?? ()
#3 0x0000000010642c2d in ?? ()
#4 0x00000000001b773f in ?? ()
#5 0x0000000000e6de00 in ?? ()
#6 0x00000000001b773f in ?? ()
#7 0x000000006b49d1df in ?? ()
#8 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7fd8238ca700 (LWP 24652)):
#0 0x00007fd821d65eb3 in epoll_wait () from /lib64/libc.so.6
#1 0x000000000060556a in ?? ()
#2 0x0000000000000001 in ?? ()
#3 0x0000000002b38480 in ?? ()
#4 0x00000000ffffffff in ?? ()
#5 0x0000000000000001 in ?? ()
#6 0x00007fd8238c9b48 in ?? ()
#7 0x0000000000664b41 in ?? ()
#8 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7fd73f79c700 (LWP 25494)):
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7fd823889700 (LWP 26495)):
#0 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000072a896 in ?? ()
#2 0x000000005f498991 in ?? ()
#3 0x0000000010644019 in ?? ()
#4 0x0000000002b2f338 in ?? ()
#5 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fd823970900 (LWP 24647)):
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()

Main thread:
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()

Local variables:
No symbol table info available.
[Inferior 1 (process 24647) detached]
--- active threads ---
thd 0 (work_0), proto mysql, state query, command update
--- Totally 2 threads, and 1 client-working threads ---
------- CRASH DUMP END -------

@tomatolog
Contributor

The index data that causes the crash is in our FTP folder github-issue-403.

@tomatolog
Contributor

It seems the RAM part of the index got damaged, and indextool checks only disk chunks. Could you truncate your index data and repopulate it from scratch to confirm that the crash goes away?

@damirt2020
Author

Does this count as repopulating from scratch? I have a copy of the index that was loaded from our ETL process.

IMPORT TABLE dx1parts FROM '/var/lib/manticore_backup/dx1parts/dx1parts';

I can reload the index like this and crash it with that query.

@tomatolog
Contributor

No, the data you are importing has an issue with the attribute that causes this crash. You have to issue INSERT / REPLACE queries after truncating the index; that is what repopulating the data from scratch means.
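A minimal sketch of the repopulation sequence described above (truncate, then REPLACE every row), assuming a Python helper that only builds the statements as text. The function names and the sample row are hypothetical, and `TRUNCATE RTINDEX` is written as I recall it from the SphinxQL dialect, so check the exact statement against the manual for your Manticore version.

```python
from typing import Dict, Iterable, List


def sql_literal(value: object) -> str:
    """Render a Python value as a SQL literal: ints bare, everything else quoted."""
    if isinstance(value, int):
        return str(value)
    # Escape backslashes first, then single quotes.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def repopulate_statements(index: str, rows: Iterable[Dict[str, object]]) -> List[str]:
    """Build a TRUNCATE followed by one REPLACE per row (hypothetical helper)."""
    # Assumption: TRUNCATE RTINDEX empties a real-time index in this version.
    statements = [f"TRUNCATE RTINDEX {index};"]
    for row in rows:
        columns = ", ".join(row)
        values = ", ".join(sql_literal(v) for v in row.values())
        statements.append(f"REPLACE INTO {index} ({columns}) VALUES ({values});")
    return statements


if __name__ == "__main__":
    # Hypothetical sample row; only the index and attribute names come from the thread.
    for stmt in repopulate_statements("dx1parts", [{"id": 1, "dealershipid": "42"}]):
        print(stmt)
```

The output could be written to a file and fed to the `mysql` client on the SphinxQL port, which keeps the data path free of any previously built index files.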

@damirt2020
Author

I loaded the data from scratch on a different VM and the same issue happened. Is something in our data causing the crash? Any idea which field or row it might be?

------- FATAL: CRASH DUMP -------
[Sat Aug 29 20:36:16.224 2020] [ 966]

--- crashed SphinxQL request dump ---
update dx1parts set dealershipid='0' where dealershipid>0
--- request dump end ---
--- local index:dx1parts
Manticore 3.5.0 1d34c49@200722 release
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 4.8.5
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDISTR_BUILD=rhel7 -DUSE_SSL=ON -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.18 -DDL_PGSQL=1 -DPGSQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/data -DFULL_SHARE_DIR=/usr/share/manticore -DUSE_RE2=1 -DUSE_ICU=1 -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=1 -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_PGSQL=1 -DWITH_RE2=1 -DWITH_STEMMER=1 -DWITH_ZLIB=ON -DGALERA_SONAME=libgalera_manticore.so.31 -DSYSCONFDIR=/etc/manticoresearch
Host OS is Linux runner-fa6cab46-project-3858465-concurrent-0 4.19.78-coreos #1 SMP Mon Oct 14 22:56:39 -00 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7f8298040cd0, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x591b10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x591b10, stack=0x7f8298040000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0x90)[0x723f30]
/usr/bin/searchd(_ZN11CrashLogger11HandleCrashEi+0x1b2)[0x592182]
/lib64/libpthread.so.0(+0xf630)[0x7f82ac4dc630]
/usr/bin/searchd[0x8cfe20]
/usr/bin/searchd(_ZNK9CSphMatch13FetchAttrDataERK15CSphAttrLocatorPKhRi+0x14)[0x677d24]
/usr/bin/searchd(_ZNK14FilterString_c4EvalERK9CSphMatch+0x25)[0x847d75]
/usr/bin/searchd(_ZNK13CSphIndex_VLN12ScanByBlocksERK16CSphQueryContextP15CSphQueryResultiPP15ISphMatchSorterR9CSphMatchibbil+0x478)[0x690278]
/usr/bin/searchd(_ZNK13CSphIndex_VLN9MultiScanEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0xaf0)[0x6aac00]
/usr/bin/searchd(_ZNK13CSphIndex_VLN10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x6b8)[0x6ba9a8]
/usr/bin/searchd[0x883fa3]
/usr/bin/searchd[0x9607e7]
/usr/bin/searchd(_ZN7Threads10CoExecuteNEOSt8functionIFvvEEi+0x17f)[0x95c57f]
/usr/bin/searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x9e3)[0x8917e3]
/usr/bin/searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x77)[0x893427]
/usr/bin/searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x62b)[0x5d0f8b]
/usr/bin/searchd(_ZN15SearchHandler_c14RunActionQueryERK9CSphQueryRK10CSphStringPS3_+0x1c8)[0x5dc6f8]
/usr/bin/searchd(_Z25HandleMySqlExtendedUpdateR14AttrUpdateArgs+0x7c)[0x5dcbcc]
/usr/bin/searchd(_ZN15CommitMonitor_c6UpdateER10CSphString+0xb8)[0x617338]
/usr/bin/searchd[0x626452]
/usr/bin/searchd[0x5b771d]
/usr/bin/searchd(_Z20sphHandleMysqlUpdateR19StmtErrorReporter_iRK9SqlStmt_tRK10CSphStringRS4_+0x213)[0x5de703]
/usr/bin/searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR11RowBuffer_i+0x7d6)[0x5fe916]
/usr/bin/searchd[0x66a6d8]
/usr/bin/searchd(_Z8SqlServe11SharedPtr_TIP13SockWrapper_c9Deleter_TIS1_L5ETYPE0EE16ISphRefcountedMTE+0x8e8)[0x66b488]
/usr/bin/searchd[0x666b2a]
/usr/bin/searchd(_ZZN7Threads11CoRoutine_cC1ESt8functionIFvvEEmENUlN5boost7context6detail10transfer_tEE_4_FUNES7_+0x17)[0x95cc07]
/usr/bin/searchd(make_fcontext+0x2f)[0x96218f]
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the documentation
(http://docs.manticoresearch.com/latest/html/reporting_bugs.html)
Dump with GDB via watchdog
[Sat Aug 29 20:36:16.369 2020] [965] watchdog: got USR1, performing dump of child's stack
Will run gdb on '/usr/bin/searchd', pid '966'
Error reading attached process's symbol file.
: No such file or directory.
Error reading attached process's symbol file.
: No such file or directory.
[New LWP 3768]
[New LWP 3827]
[New LWP 971]
[New LWP 969]
[New LWP 968]
[New LWP 967]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
Id Target Id Frame
7 Thread 0x7f82acfc9700 (LWP 967) "work_0" 0x00007f82ac4d8a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
6 Thread 0x7f82acfa8700 (LWP 968) "work_1" 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
5 Thread 0x7f82acf87700 (LWP 969) "TaskSched" 0x00007f82ac4d8de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7f82acf45700 (LWP 971) "TickPool_0" 0x00007f82ab3c0eb3 in epoll_wait () from /lib64/libc.so.6
3 Thread 0x7f82acf66700 (LWP 3827) "sphTimer" 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
2 Thread 0x7f81bce0f700 (LWP 3768) "TaskW_127" 0x00007f82ac4d8de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x7f82acfcb900 (LWP 966) "searchd" 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6

Thread 7 (Thread 0x7f82acfc9700 (LWP 967)):
#0 0x00007f82ac4d8a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000093f9d4 in ?? ()
#2 0x00007f82acfc8c5f in ?? ()
#3 0x0000000002d78800 in ?? ()
#4 0x0000000002d78868 in ?? ()
#5 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7f82acfa8700 (LWP 968)):
#0 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000002 in ?? ()
#3 0x00000000000ab621 in ?? ()
#4 0x00000000000000f5 in ?? ()
#5 0x0000000000723fb0 in ?? ()
#6 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f82acf87700 (LWP 969)):
#0 0x00007f82ac4d8de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000072a9e6 in ?? ()
#2 0x000000005f4abd0a in ?? ()
#3 0x00000000275e70ec in ?? ()
#4 0x00000000000a61c5 in ?? ()
#5 0x0000000000e6de00 in ?? ()
#6 0x00000000000a61c5 in ?? ()
#7 0x00000000288deca7 in ?? ()
#8 0x0000000000000000 in ?? ()
Thread 4 (Thread 0x7f82acf45700 (LWP 971)):
#0 0x00007f82ab3c0eb3 in epoll_wait () from /lib64/libc.so.6
#1 0x000000000060556a in ?? ()
#2 0x0000000000000001 in ?? ()
#3 0x0000000002d6fd30 in ?? ()
#4 0x00000000ffffffff in ?? ()
#5 0x0000000000000001 in ?? ()
#6 0x00007f82acf44b48 in ?? ()
#7 0x0000000000664b41 in ?? ()
#8 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f82acf66700 (LWP 3827)):
#0 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f81bce0f700 (LWP 3768)):
#0 0x00007f82ac4d8de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000072a896 in ?? ()
#2 0x000000005f4abcba in ?? ()
#3 0x00000000102e264b in ?? ()
#4 0x0000000002d66338 in ?? ()
#5 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f82acfcb900 (LWP 966)):
#0 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()

Main thread:
#0 0x00007f82ab3b79a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()

Local variables:
No symbol table info available.
[Inferior 1 (process 966) detached]
--- active threads ---
thd 0 (work_1), proto mysql, state query, command update
--- Totally 2 threads, and 1 client-working threads ---
------- CRASH DUMP END -------

@tomatolog
Contributor

If you have the INSERT / REPLACE stream of data that recreates the index causing the crash, we could check that stream and the resulting index to investigate the crash further.

@githubmanticore added the bug and waiting labels on Aug 31, 2020
@damirt2020
Author

damirt2020 commented Aug 31, 2020

All right. I created an index from scratch, uploaded about 2.5M records (out of 15M total), ran a query, and crashed it.

I uploaded the files to the github-issue-403 folder. create_index.sql contains the statement used to create the index, and public_all_test.sql.gz contains INSERT statements for all records, which you can use to populate the index.

I ran the following to insert the data: mysql -f -h 10.0.4.6 -P 9306 < public_all_test.sql (Note: a few rows with the wrong encoding will fail to load, hence the -f.)

------- FATAL: CRASH DUMP -------
[Mon Aug 31 20:50:43.957 2020] [18527]

--- crashed SphinxQL request dump ---
update dx1parts set dealershipid='0' where dealershipid > 0
--- request dump end ---
--- local index:dx1parts
Manticore 3.5.0 1d34c49@200722 release
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 4.8.5
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDISTR_BUILD=rhel7 -DUSE_SSL=ON -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.18 -DDL_PGSQL=1 -DPGSQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/data -DFULL_SHARE_DIR=/usr/share/manticore -DUSE_RE2=1 -DUSE_ICU=1 -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=1 -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_PGSQL=1 -DWITH_RE2=1 -DWITH_STEMMER=1 -DWITH_ZLIB=ON -DGALERA_SONAME=libgalera_manticore.so.31 -DSYSCONFDIR=/etc/manticoresearch
Host OS is Linux runner-fa6cab46-project-3858465-concurrent-0 4.19.78-coreos #1 SMP Mon Oct 14 22:56:39 -00 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7f7dbc020990, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x591b10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x591b10, stack=0x7f7dbc020000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0x90)[0x723f30]
/usr/bin/searchd(_ZN11CrashLogger11HandleCrashEi+0x1b2)[0x592182]
/lib64/libpthread.so.0(+0xf630)[0x7f7dd4c5b630]
/usr/bin/searchd[0x8cfe20]
/usr/bin/searchd(_ZNK9CSphMatch13FetchAttrDataERK15CSphAttrLocatorPKhRi+0x14)[0x677d24]
/usr/bin/searchd(_ZNK14FilterString_c4EvalERK9CSphMatch+0x25)[0x847d75]
/usr/bin/searchd(_ZNK13CSphIndex_VLN12ScanByBlocksERK16CSphQueryContextP15CSphQueryResultiPP15ISphMatchSorterR9CSphMatchibbil+0x478)[0x690278]
/usr/bin/searchd(_ZNK13CSphIndex_VLN9MultiScanEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0xaf0)[0x6aac00]
/usr/bin/searchd(_ZNK13CSphIndex_VLN10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x6b8)[0x6ba9a8]
/usr/bin/searchd[0x883fa3]
/usr/bin/searchd[0x9607e7]
/usr/bin/searchd(_ZN7Threads10CoExecuteNEOSt8functionIFvvEEi+0x17f)[0x95c57f]
/usr/bin/searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x9e3)[0x8917e3]
/usr/bin/searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x77)[0x893427]
/usr/bin/searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x62b)[0x5d0f8b]
/usr/bin/searchd(_ZN15SearchHandler_c14RunActionQueryERK9CSphQueryRK10CSphStringPS3_+0x1c8)[0x5dc6f8]
/usr/bin/searchd(_Z25HandleMySqlExtendedUpdateR14AttrUpdateArgs+0x7c)[0x5dcbcc]
/usr/bin/searchd(_ZN15CommitMonitor_c6UpdateER10CSphString+0xb8)[0x617338]
/usr/bin/searchd[0x626452]
/usr/bin/searchd[0x5b771d]
/usr/bin/searchd(_Z20sphHandleMysqlUpdateR19StmtErrorReporter_iRK9SqlStmt_tRK10CSphStringRS4_+0x213)[0x5de703]
/usr/bin/searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR11RowBuffer_i+0x7d6)[0x5fe916]
/usr/bin/searchd[0x66a6d8]
/usr/bin/searchd(_Z8SqlServe11SharedPtr_TIP13SockWrapper_c9Deleter_TIS1_L5ETYPE0EE16ISphRefcountedMTE+0x8e8)[0x66b488]
/usr/bin/searchd[0x666b2a]
/usr/bin/searchd(_ZZN7Threads11CoRoutine_cC1ESt8functionIFvvEEmENUlN5boost7context6detail10transfer_tEE_4_FUNES7_+0x17)[0x95cc07]
/usr/bin/searchd(make_fcontext+0x2f)[0x96218f]
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the documentation
(http://docs.manticoresearch.com/latest/html/reporting_bugs.html)
Dump with GDB via watchdog
[Mon Aug 31 20:50:44.417 2020] [18526] watchdog: got USR1, performing dump of child's stack
Will run gdb on '/usr/bin/searchd', pid '18527'
--- active threads ---
thd 0 (work_1), proto mysql, state query, command update
--- Totally 2 threads, and 1 client-working threads ---
------- CRASH DUMP END -------
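
For anyone comparing the three dumps, here is a minimal sketch of pulling the mangled symbol, offset, and address out of the "system backtrace" lines so the frames can be diffed or demangled. The `Frame` type and `parse_frame` function are hypothetical names for illustration and are not part of Manticore; the line format assumed is the `binary(symbol+offset)[address]` form shown in the dumps.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as:
#   /usr/bin/searchd(_Z12sphBacktraceib+0x90)[0x723f30]
#   /lib64/libpthread.so.0(+0xf630)[0x7f7dd4c5b630]
#   /usr/bin/searchd[0x8cfe20]
FRAME_RE = re.compile(
    r"^(?P<binary>[^(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    binary: str
    symbol: Optional[str]   # None when the frame has no symbol name
    offset: Optional[int]   # None when the frame has no "(...+0x..)" part
    address: int


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one backtrace line; return None for lines that are not frames."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    offset = match.group("offset")
    return Frame(
        binary=match.group("binary"),
        symbol=match.group("symbol") or None,
        offset=int(offset, 16) if offset else None,
        address=int(match.group("address"), 16),
    )


if __name__ == "__main__":
    sample = "/usr/bin/searchd(_ZNK14FilterString_c4EvalERK9CSphMatch+0x25)[0x847d75]"
    print(parse_frame(sample))
```

The extracted `symbol` values can then be passed to a demangler such as `c++filt` to get readable C++ names; frames with no symbol would need the matching debug symbols to resolve.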

@sanikolaev
Collaborator

Thank you. I could reproduce the crash on our dev server. We'll look into it.

@damirt2020
Author

@sanikolaev Sergey, any updates on this issue? If you have any hints about which part of our data is causing it, we might be able to modify the data and load it differently. Please let me know if there is anything we can do to expedite the resolution.

@tomatolog
Contributor

I've just pushed the fix to the codebase. Once it passes CI, it will be packaged into the dev repo, where you can grab it.

Alternatively, you can build the daemon from source: master branch, commit e852474.

@tomatolog removed the waiting label on Sep 8, 2020
@damirt2020
Author

@sanikolaev Thanks Sergey. I tested the fix and it does not crash anymore. Appreciate the help.
