Updating index causes a crash #403
The index data that causes the crash is in our FTP folder github-issue-403.
It seems the RAM part of the index got damaged, and indextool checks only disk chunks. Could you truncate your index and repopulate it from scratch to check whether the crash goes away?
Does this count as repopulating from scratch? I have a copy of the index that was loaded from our ETL process. IMPORT TABLE dx1parts FROM '/var/lib/manticore_backup/dx1parts/dx1parts'; I can reload the index like this and crash it with that query.
No. The data you are importing has an issue with the attribute that causes this crash. You have to issue INSERT / REPLACE queries after truncating the index; that is what repopulating the data from scratch means.
I loaded the data from scratch on a different VM and the same issue happened. Is something in our data causing the crash? Any idea which field or row it might be?
------- FATAL: CRASH DUMP -------
--- crashed SphinxQL request dump ---
Thread 7 (Thread 0x7f82acfc9700 (LWP 967)):
Thread 6 (Thread 0x7f82acfa8700 (LWP 968)):
Thread 5 (Thread 0x7f82acf87700 (LWP 969)):
Thread 3 (Thread 0x7f82acf66700 (LWP 3827)):
Thread 2 (Thread 0x7f81bce0f700 (LWP 3768)):
Thread 1 (Thread 0x7f82acfcb900 (LWP 966)):
Main thread:
Local variables:
If you have the stream of INSERT / REPLACE statements that recreates the index that causes the crash, we could check that stream and the resulting index to investigate the crash further.
All right. I created an index from scratch, uploaded about 2.5M records (out of 15M total), ran a query, and crashed it. I uploaded the files to the github-issue-403 folder. create_index.sql contains the statement used to create the index, and public_all_test.sql.gz contains INSERT statements for all records, which you can use to populate the index. I ran the following to insert the data:
mysql -f -h 10.0.4.6 -P 9306 < public_all_test.sql
(Note: a few rows with the wrong encoding will fail to load, hence the -f.)
------- FATAL: CRASH DUMP -------
--- crashed SphinxQL request dump ---
Thank you. I was able to reproduce the crash on our dev server. We'll look into it.
@sanikolaev Sergey, any updates on this issue? If you have any hints about which part of our data is causing it, we might be able to modify the data and load it differently. Please let me know if there is anything we can do to expedite the resolution.
@sanikolaev Thanks Sergey. I tested the fix and it does not crash anymore. Appreciate the help.
I am now able to reproduce the crash consistently and have the index files ready for you, cleansed of any customer-sensitive data. If you run this query, searchd crashes right away:
update dx1parts set dealershipid='0' where dealershipid <> '0';
The manticore folder containing the index and manticore.conf has been uploaded to your FTP server, in the github-issue-403 folder.
------- FATAL: CRASH DUMP -------
[Fri Aug 28 22:45:34.448 2020] [24647]
--- crashed SphinxQL request dump ---
update dx1parts set dealershipid='0' where dealershipid <> '0'
--- request dump end ---
--- local index:dx1parts
Manticore 3.5.0 1d34c49@200722 release
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 4.8.5
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDISTR_BUILD=rhel7 -DUSE_SSL=ON -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.18 -DDL_PGSQL=1 -DPGSQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/data -DFULL_SHARE_DIR=/usr/share/manticore -DUSE_RE2=1 -DUSE_ICU=1 -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=1 -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_PGSQL=1 -DWITH_RE2=1 -DWITH_STEMMER=1 -DWITH_ZLIB=ON -DGALERA_SONAME=libgalera_manticore.so.31 -DSYSCONFDIR=/etc/manticoresearch
Host OS is Linux runner-fa6cab46-project-3858465-concurrent-0 4.19.78-coreos #1 SMP Mon Oct 14 22:56:39 -00 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7fd804020b90, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x591b10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x591b10, stack=0x7fd804020000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0x90)[0x723f30]
/usr/bin/searchd(_ZN11CrashLogger11HandleCrashEi+0x1b2)[0x592182]
/lib64/libpthread.so.0(+0xf630)[0x7fd822e81630]
/usr/bin/searchd[0x8cfe20]
/usr/bin/searchd(_ZNK9CSphMatch13FetchAttrDataERK15CSphAttrLocatorPKhRi+0x14)[0x677d24]
/usr/bin/searchd(_ZNK14FilterString_c4EvalERK9CSphMatch+0x25)[0x847d75]
/usr/bin/searchd(_ZNK10Filter_Not4EvalERK9CSphMatch+0xe)[0x84591e]
/usr/bin/searchd(_ZNK13CSphIndex_VLN12ScanByBlocksERK16CSphQueryContextP15CSphQueryResultiPP15ISphMatchSorterR9CSphMatchibbil+0x478)[0x690278]
/usr/bin/searchd(_ZNK13CSphIndex_VLN9MultiScanEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0xaf0)[0x6aac00]
/usr/bin/searchd(_ZNK13CSphIndex_VLN10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x6b8)[0x6ba9a8]
/usr/bin/searchd[0x883fa3]
/usr/bin/searchd[0x9607e7]
/usr/bin/searchd(_ZN7Threads10CoExecuteNEOSt8functionIFvvEEi+0x17f)[0x95c57f]
/usr/bin/searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x9e3)[0x8917e3]
/usr/bin/searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x77)[0x893427]
/usr/bin/searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x62b)[0x5d0f8b]
/usr/bin/searchd(_ZN15SearchHandler_c14RunActionQueryERK9CSphQueryRK10CSphStringPS3_+0x1c8)[0x5dc6f8]
/usr/bin/searchd(_Z25HandleMySqlExtendedUpdateR14AttrUpdateArgs+0x7c)[0x5dcbcc]
/usr/bin/searchd(_ZN15CommitMonitor_c6UpdateER10CSphString+0xb8)[0x617338]
/usr/bin/searchd[0x626452]
/usr/bin/searchd[0x5b771d]
/usr/bin/searchd(_Z20sphHandleMysqlUpdateR19StmtErrorReporter_iRK9SqlStmt_tRK10CSphStringRS4_+0x213)[0x5de703]
/usr/bin/searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR11RowBuffer_i+0x7d6)[0x5fe916]
/usr/bin/searchd[0x66a6d8]
/usr/bin/searchd(_Z8SqlServe11SharedPtr_TIP13SockWrapper_c9Deleter_TIS1_L5ETYPE0EE16ISphRefcountedMTE+0x8e8)[0x66b488]
/usr/bin/searchd[0x666b2a]
/usr/bin/searchd(_ZZN7Threads11CoRoutine_cC1ESt8functionIFvvEEmENUlN5boost7context6detail10transfer_tEE_4_FUNES7_+0x17)[0x95cc07]
/usr/bin/searchd(make_fcontext+0x2f)[0x96218f]
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the documentation
(http://docs.manticoresearch.com/latest/html/reporting_bugs.html)
Dump with GDB via watchdog
[Fri Aug 28 22:45:34.679 2020] [24646] watchdog: got USR1, performing dump of child's stack
Will run gdb on '/usr/bin/searchd', pid '24647'
Error reading attached process's symbol file.
: No such file or directory.
Error reading attached process's symbol file.
: No such file or directory.
[New LWP 26495]
[New LWP 25494]
[New LWP 24652]
[New LWP 24650]
[New LWP 24649]
[New LWP 24648]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
Id Target Id Frame
7 Thread 0x7fd82396e700 (LWP 24648) "work_0" 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
6 Thread 0x7fd82394d700 (LWP 24649) "work_1" 0x00007fd822e7da35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x7fd82392c700 (LWP 24650) "TaskSched" 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7fd8238ca700 (LWP 24652) "TickPool_0" 0x00007fd821d65eb3 in epoll_wait () from /lib64/libc.so.6
3 Thread 0x7fd73f79c700 (LWP 25494) "sphTimer" 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
2 Thread 0x7fd823889700 (LWP 26495) "TaskW_3" 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
Thread 7 (Thread 0x7fd82396e700 (LWP 24648)):
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000002 in ?? ()
#3 0x00000000000a34f3 in ?? ()
#4 0x00000000000000f5 in ?? ()
#5 0x0000000000723fb0 in ?? ()
#6 0x0000000000000000 in ?? ()
Thread 6 (Thread 0x7fd82394d700 (LWP 24649)):
#0 0x00007fd822e7da35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000093f9d4 in ?? ()
#2 0x00007fd82394cc5f in ?? ()
#3 0x0000000002b41930 in ?? ()
#4 0x0000000002b41998 in ?? ()
#5 0x0000000000000001 in ?? ()
#6 0x0000000000000000 in ?? ()
Thread 5 (Thread 0x7fd82392c700 (LWP 24650)):
#0 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000072a9e6 in ?? ()
#2 0x000000005f498e41 in ?? ()
#3 0x0000000010642c2d in ?? ()
#4 0x00000000001b773f in ?? ()
#5 0x0000000000e6de00 in ?? ()
#6 0x00000000001b773f in ?? ()
#7 0x000000006b49d1df in ?? ()
#8 0x0000000000000000 in ?? ()
Thread 4 (Thread 0x7fd8238ca700 (LWP 24652)):
#0 0x00007fd821d65eb3 in epoll_wait () from /lib64/libc.so.6
#1 0x000000000060556a in ?? ()
#2 0x0000000000000001 in ?? ()
#3 0x0000000002b38480 in ?? ()
#4 0x00000000ffffffff in ?? ()
#5 0x0000000000000001 in ?? ()
#6 0x00007fd8238c9b48 in ?? ()
#7 0x0000000000664b41 in ?? ()
#8 0x0000000000000000 in ?? ()
Thread 3 (Thread 0x7fd73f79c700 (LWP 25494)):
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x7fd823889700 (LWP 26495)):
#0 0x00007fd822e7dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000072a896 in ?? ()
#2 0x000000005f498991 in ?? ()
#3 0x0000000010644019 in ?? ()
#4 0x0000000002b2f338 in ?? ()
#5 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7fd823970900 (LWP 24647)):
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()
Main thread:
#0 0x00007fd821d5c9a3 in select () from /lib64/libc.so.6
#1 0x00000000006704f4 in ?? ()
#2 0x0000000000000000 in ?? ()
Local variables:
No symbol table info available.
[Inferior 1 (process 24647) detached]
--- active threads ---
thd 0 (work_0), proto mysql, state query, command update
--- Totally 2 threads, and 1 client-working threads ---
------- CRASH DUMP END -------
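
For anyone triaging a dump like the one above, the system backtrace lines can be split into binary, mangled symbol, offset, and address with a few lines of Python. This is only a sketch: `Frame`, `FRAME_RE`, `parse_frame`, and `parse_backtrace` are names made up for this illustration and are not part of Manticore, and the regular expression assumes the `binary(symbol+0xoffset)[0xaddress]` layout seen in this dump. It accepts either `+` or a space before the offset, since the `+` can get lost when a dump is pasted into a web page.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed line layout (taken from the dump in this issue):
#   /usr/bin/searchd(_Z12sphBacktraceib+0x90)[0x723f30]
#   /lib64/libpthread.so.0(+0xf630)[0x7fd822e81630]
#   /usr/bin/searchd[0x8cfe20]
# The separator before the offset may be "+" or whitespace.
FRAME_RE = re.compile(
    r"^(?P<binary>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^\s+)]*)[+\s]*(?P<offset>0x[0-9a-fA-F]+)?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    binary: str
    symbol: Optional[str]   # mangled name, or None if the frame has no symbol
    offset: Optional[int]   # offset inside the symbol, or None if absent
    address: int            # absolute return address


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one backtrace line; return None for lines that are not frames."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    offset = m.group("offset")
    return Frame(
        binary=m.group("binary"),
        symbol=m.group("symbol") or None,
        offset=int(offset, 16) if offset else None,
        address=int(m.group("address"), 16),
    )


def parse_backtrace(text: str) -> List[Frame]:
    """Return every frame found in a pasted dump, in the order it appears."""
    frames = []
    for line in text.splitlines():
        frame = parse_frame(line)
        if frame is not None:
            frames.append(frame)
    return frames
```

The mangled names it returns can then be passed to a demangler such as c++filt to get readable C++ signatures. Frames without a symbol, like `/usr/bin/searchd[0x8cfe20]`, still need the matching searchd binary and symbols to resolve.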