Lost connection to MySQL server during query #325
It happens every time I execute this query. The same error occurs when executing the following query:
|
➤ Sergey Nikolaev commented: Hi, I can't reproduce it on a test index:
So, if possible, could you upload your index to our write-only FTP server?
It would be very helpful for debugging the issue. |
Done. |
It seems your RT index has changed since the crash: I tried to reproduce the crash on the data you uploaded and saw no issue. I got a correct reply.
I need the same index that causes the crash, or a way to reproduce the crash here locally. |
No, it hasn't. What additional info can I provide so that you can reproduce the issue? |
@tomatolog Maybe there is a difference in the config files? Could you please provide your searchd config? I'll try it on our server. |
Here is the config you provided, with only the search section from me:
|
Could you provide or upload searchd.log? Maybe there are many different crashes logged there, or several different queries that cause the daemon to crash. |
Could you also tell us your box's OS and the package you use? Could you create a case in a Docker container that crashes the daemon and upload it to our FTP? |
@tomatolog Uploaded to the same folder.
Linux dev-search.local 3.10.0-1062.18.1.el7.x86_64 #1 SMP Tue Mar 17 23:49:17 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
We're working on it. |
JFYI, the same index configuration works properly in Sphinx 3.2.1 on the same server. |
From the log you provided I see that a lot of updates were replayed from the binlog after every daemon restart. Did you upload your index after a clean daemon shutdown? Otherwise the index might not be in the state that causes the crash. |
Yes, I stopped the Manticore service, put the index data into /data/sphinx and cleared the binlogs. After this I started the service. Almost all queries against this index are executed properly, except the problematic ones. I've uploaded *.vdi images to your FTP server (dev-search.zip). The problem is reproduced there. root pass is 1234. |
What OS should I use for your vdi file? |
Linux 64-bit. There is CentOS inside. |
There are two vdi files inside. Should I mount both, or a specific one? |
Mount both |
I checked the VM you provided and reproduced the crash with this data. However, I checked that index with indextool and see that a disk chunk of the RT index is damaged
That is why the daemon crashed when using such an invalid string attribute. I need a way to investigate how the index got into such a state. |
I just start filling the index and it breaks at some point every time. The same process works properly with the latest version of Sphinx. I can fill the index from scratch and provide you with query.log and searchd.log after that. Will it help you? |
No, query.log is not appropriate, as it contains only select / search queries and we cannot recreate the index from it. |
Ok, I'll try to make such a file in 1-2 days. I'll upload it to the server and let you know about it. |
@tomatolog Uploaded to /issue-325/log.sql.zip |
I truncated your index and then inserted the data you provided there as:
After the population finished, I checked the index with indextool and saw none of the issues that the index from the VM has. I don't need raw data or an already broken index; I need a way to recreate how the index becomes broken. Maybe there should be a script that inserts data from a stream in N parallel threads, or interleaves the population with searches or OPTIMIZE commands. Do you have any thoughts on how to get a broken index? |
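The kind of reproduction script suggested above could be sketched roughly as follows. This is only an illustration, not a script anyone in this thread actually ran: `build_worker_plans`, `run_plans` and `make_executor` are hypothetical names, and the connection to searchd is left to the caller (any MySQL-protocol client could be wrapped in `make_executor`). The probe statements reuse the failing query from this issue; how often to run them is an arbitrary assumption.

```python
import threading
from typing import Callable, List, Sequence

# Probe statements interleaved with the inserts. The SELECT is the failing
# query from this issue; the probe frequency below is an arbitrary assumption.
DEFAULT_PROBES = [
    "SELECT COUNT(DISTINCT un_sku_id) FROM sku_20200317",
    "OPTIMIZE INDEX sku_20200317",
]


def build_worker_plans(
    inserts: Sequence[str],
    workers: int,
    probe_every: int,
    probes: Sequence[str] = DEFAULT_PROBES,
) -> List[List[str]]:
    """Split inserts round-robin across workers and add a probe after
    every `probe_every` inserts of each worker, rotating through `probes`."""
    per_worker: List[List[str]] = [[] for _ in range(workers)]
    for i, stmt in enumerate(inserts):
        per_worker[i % workers].append(stmt)

    plans: List[List[str]] = []
    for stmts in per_worker:
        mixed: List[str] = []
        for n, stmt in enumerate(stmts, start=1):
            mixed.append(stmt)
            if probe_every > 0 and probes and n % probe_every == 0:
                mixed.append(probes[(n // probe_every - 1) % len(probes)])
        plans.append(mixed)
    return plans


def run_plans(
    plans: Sequence[Sequence[str]],
    make_executor: Callable[[], Callable[[str], None]],
) -> None:
    """Run every plan in its own thread. `make_executor` is called once per
    thread so that each worker can own a separate connection."""

    def worker(plan: Sequence[str]) -> None:
        execute = make_executor()
        for stmt in plan:
            execute(stmt)

    threads = [threading.Thread(target=worker, args=(plan,)) for plan in plans]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

After such a run, the index could be checked with indextool to see whether a disk chunk was damaged along the way.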
@tomatolog Ok, I have uploaded an extended SQL file to the server (/issue-325/log_extended.sql.gz). What I did to reproduce the problem:
After this I got ERROR 2013 (HY000): Lost connection to MySQL server during query. Note that this problem is reproduced in the VM using the vdi images uploaded earlier. |
Ok, I will try on a native box first, and then in the VM you provided in case the native build shows no issue. |
@tomatolog Any updates? |
@tomatolog Hello again! Did you reproduce the issue with the provided info? |
@tomatolog Unfortunately, it doesn't work. It still fails on the following query:
Server version: 3.4.3 ab7cbe5@200511 dev |
➤ Stan commented: Could you provide the crash log from the new build? |
[Wed May 13 08:06:46.914 2020] [1270] rt: index sku_20200317: diskchunk 13(1), segments 32 saved in 12.928 sec
--- crashed SphinxQL request dump ---
Thread 8 (Thread 0x7f6680780700 (LWP 967)):
Thread 7 (Thread 0x7f6677114700 (LWP 968)):
Thread 6 (Thread 0x7f6676f0a700 (LWP 1289)):
Thread 5 (Thread 0x7f6676ace700 (LWP 1290)):
Thread 4 (Thread 0x7f66769c9700 (LWP 1291)):
Thread 3 (Thread 0x7f66768c4700 (LWP 1292)):
Thread 2 (Thread 0x7f667700f700 (LWP 1340)):
Thread 1 (Thread 0x7f66807828c0 (LWP 966)):
Main thread:
Local variables:
wget https://codeload.github.com/manticoresoftware/manticoresearch/zip/ab7cbe5 -O manticore.zip
Unpack the sources by command:
For comfortable debug also suggest to append a substitution def to your ~/.gdbinit file: |
Could you upload this index (the one where the crash happens) to our FTP? |
@tomatolog Uploaded to /issue-325/sku(2020May13).tar.gz |
I got a correct reply for your recent index from the archive
The daemon run under the valgrind checker shows no errors. Could you upload to our FTP the package with which the daemon crashes, along with its debug package? Or the daemon binary along with its symbol file? For the binaries I need not only the daemon and its symbols but also the indexer binary, to check which cmake options you use for the build. With the daemon that I built (Manticore 3.4.3 ab7cbe5@200511 dev) I see no crash like the one you describe. |
We will try again on a clean Manticore installation when the new version is released and will let you know the result. @tomatolog BTW, which flags should be used for building the daemon? Maybe we can try to rebuild it properly. |
Here is the output of the binaries I tried on the dev box with your data
|
Here are rhel7 packages that I built from the master version in the CI pipeline we use for regular releases. Could you check them? |
Still no luck. I'll wait for the next release version and try there. If the problem is fixed, I will close the issue. |
You can also check one of these https://repo.manticoresearch.com/#browse/browse:dev:release%2Fcentos%2F7 The most recent at the moment is from June 8 |
@sanikolaev We have tried it using your latest dev-release, it fails as well. I've uploaded the file to your server (/issue-325/vagrant-pack.tar.gz). Please follow the instructions in the README.md file and you will reproduce the error. |
Thanks! We'll take a look into it. |
I installed Vagrant, Ansible and VirtualBox on a VPS; however, it failed to start with the following error message
|
When I replayed your data.sql on a bare-metal box with the daemon installed, I got a correct reply
|
Not sure why you experience this error. We checked it on MacOS and Ubuntu with different versions of Ansible, Vagrant and VBox. It works properly. Could you please check it using another environment? Thank you in advance. |
In case you still use VirtualBox, could you perform your steps up to, but not including, the last step where the daemon crashes? Then I could import that VM into VirtualBox and check the crash without posting data or doing any setup. |
Uploaded to /issue-325/vagrant-pack_default_1592323550793_68799.ova Just run the machine and execute: |
Pavel, |
I reproduced the issue with the VM image you provided. The installed version crashes on inserting data.
with the following stack
I will investigate the issue and inform you about the fix |
Hello! I had done it twice, however, it did not help you to find the cause of the problem. It's not hard for me to provide the log info. But I think it would be better if you reproduce the problem by yourselves. The provided VB snapshot will certainly help you. Just import it and restore the dump, and you will see the problem. |
Good to hear that! I look forward to your reply! |
Do you have crashes similar to these outside a VM? Do you work mostly with VMs or with metal boxes (VPS)? Inside the VM you provided I get crashes on insert, or on select after the insert finishes, but the crashes all happen at different places, so it is still hard to get a stable reproduction and debug the root cause, as there is no single point where the crashes happen. However, on a metal box or VPS (centos7, ubuntu) I see no crashes on either insert or select. That is why I am asking about the crashes in your environment. |
Yes, we have the crash reproduced on the VPS server running on CentOS. We built this VM image similar to our VPS environment just to provide you with it. |
Hello there! Any news? @tomatolog |
Hello @glukkkk. We can reproduce the issue with some probability, but only in the VM you provided; we cannot reproduce it on a bare metal server running centos/ubuntu, in Docker, or on a Hetzner VPS running Centos 7. The problem seems to be too specific and is taking too much time. If it's mission critical for you, please consider using our professional services https://manticoresearch.com/services , you can then perhaps give us access to your server so we can do the debugging right there. |
Thank you for your time and patience. We have installed the latest version 3.5.0 and the problem is not reproduced there anymore. It seems that some of your commits fixed it! |
--- crashed SphinxQL request dump ---
select count(distinct un_sku_id) From sku_20200317
--- request dump end ---
--- local index:sku_20200317
Manticore 3.4.0 b212975@200327 release
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 4.8.5
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDISTR_BUILD=rhel7 -DUSE_SSL=ON -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1
-DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.18 -DDL_PGSQL=1 -DPGSQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/data -DFULL_SHARE_DIR=/usr/share/manticore -DUSE_ICU=1 -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=1 -DWITH_ICONV=ON
-DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_PGSQL=1 -DWITH_RE2=1 -DWITH_STEMMER=1 -DWITH_ZLIB=ON -DGALERA_SOVERSION=31 -DSYSCONFDIR=/etc/manticoresearch
Host OS is Linux runner-ed2dce3a-project-3858465-concurrent-0 4.19.78-coreos #1 SMP Mon Oct 14 22:56:39 -00 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7fd8e7982d7f, thread stack size = 0x100000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0xc)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0xc, stack=0x7fd8e7980000, stacksize=0x100000)
Trying system backtrace:
begin of system symbols:
searchd(_Z12sphBacktraceib+0x90)[0x710e20]
searchd(_ZN16SphCrashLogger_c11HandleCrashEi+0x1fe)[0x58409e]
/lib64/libpthread.so.0(+0xf5f0)[0x7fd8f8c375f0]
searchd[0x8987c0]
searchd(_Z14sphGetBlobAttrRK9CSphMatchRK15CSphAttrLocatorPKh+0x22)[0x899f02]
searchd(_ZNK9CSphMatch13FetchAttrDataERK15CSphAttrLocatorPKh+0x10)[0x665fa0]
searchd[0x71efc6]
searchd(_ZN23CSphImplicitGroupSorterI16MatchGeneric2_fnLb1ELb0EE6MoveToEP15ISphMatchSorter+0xc9)[0x7669e9]
searchd(_Z15FlattenToSorterP15ISphMatchSorter11VecTraits_TIS0_E+0x30)[0x84b080]
searchd(_ZN13Tls_context_c8FinalizeEv+0x1e1)[0x870711]
searchd(_Z15QueryDiskChunksPK9CSphQueryP15CSphQueryResultRK18CSphMultiQueryArgsR15SphChunkGuard_tR11VecTraits_TIP15ISphMatchSorterEP16CSphQueryProfilebPK15CSphOrderedHashIl10CSphString15CSphStrHashFuncLi256EElPKcRS9_IPKhEl+0x49a)[0x85d52a]
searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x45f)[0x85f7af]
searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x77)[0x861517]
searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x76a)[0x5bc57a]
searchd(_ZN15SearchHandler_c9RunSubsetEii+0xca7)[0x5ced77]
searchd(_ZN15SearchHandler_c10RunQueriesEv+0xb5)[0x5cfaf5]
searchd(_Z17HandleMysqlSelectR11RowBuffer_iR15SearchHandler_c+0x1b0)[0x5d0180]
searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR11RowBuffer_iRN7Threads9ThdDesc_tE+0x1411)[0x5f45c1]
searchd(_Z15LoopClientMySQLRhR16CSphinxqlSessionR10CSphStringibRN7Threads9ThdDesc_tER13InputBuffer_cR16ISphOutputBuffer+0x322)[0x5d1022]
searchd[0x5d13cb]
searchd(_Z17HandlerThreadFuncPv+0x19)[0x5d3519]
searchd(_ZN16SphCrashLogger_c13ThreadWrapperEPv+0x43)[0x583cd3]
searchd(_Z20sphThreadProcWrapperPv+0x23)[0x7149a3]
/lib64/libpthread.so.0(+0x7e65)[0x7fd8f8c2fe65]
/lib64/libc.so.6(clone+0x6d)[0x7fd8f744688d]
-------------- backtrace ends here ---------------
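For anyone comparing several crash logs like the one above, a small parser can make the frames easier to diff. The following is a minimal sketch, not part of Manticore: `Frame` and `parse_frames` are hypothetical names, and the regular expression assumes the `module(symbol+offset)[address]` layout seen in this log (it also tolerates a space in place of the `+`). The mangled C++ symbols it extracts can then be passed to a demangler such as `c++filt`.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]   # mangled symbol name, None if not resolved
    offset: Optional[int]   # offset inside the symbol (or module), if present
    address: int            # absolute return address


# module, optional "(symbol+offset)" part, then "[address]".
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^\s+)]*)[\s+]*(?P<offset>0x[0-9a-fA-F]+)?\))?"
    r"\s*\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(text: str) -> List[Frame]:
    """Return the backtrace frames found in `text`, skipping other lines."""
    frames: List[Frame] = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if not m:
            continue
        offset = m.group("offset")
        frames.append(
            Frame(
                module=m.group("module"),
                symbol=m.group("symbol") or None,
                offset=int(offset, 16) if offset else None,
                address=int(m.group("address"), 16),
            )
        )
    return frames
```

Comparing the first few resolved symbols of two crashes is a quick way to tell whether they fail in the same place, which was one of the open questions in this thread.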