REPLACE INTO leading to index corruption #569

Closed
pakud opened this issue Jun 5, 2021 · 5 comments
pakud commented Jun 5, 2021

Describe the bug
After loading the attached file, indextool --check reports a corrupted index.

To Reproduce
Steps to reproduce the behavior:

Download the contents of github-issue-569 from https://mnt.cr/ftp and unpack it.

config:

common {  
    plugin_dir = /usr/local/manticore/lib  
}  
  
index mantisRT  
{  
        hitless_words = all  
        dict=keywords  
        path = /var/lib/manticore/mantisRT  
        morphology = none  
        stopwords =  
        min_word_len = 1  
        min_prefix_len = 3  
        min_infix_len = 0  
  
        type = rt  
        rt_field = content  
        charset_table = 0..9, A..Z->a..z, _, a..z  
        blend_chars=!,",',#,$,%,&,(,),*,+,U+002C,-,@,?,_,[,],|,~,.  
}  
  
searchd {  
    listen = 127.0.0.1:9312  
    listen = 127.0.0.1:9306:mysql  
    listen = 127.0.0.1:9308:http  
    log = /var/log/manticore/searchd.log  
    query_log = /var/log/manticore/query.log  
    pid_file = /var/run/manticore/searchd.pid  
    query_log_format = sphinxql  
    seamless_rotate=1  
    rt_flush_period = 60  
    workers=threads  
}  

Load the 'suspect' file [ uploaded via https://mnt.cr/ftp ]. It is full of REPLACE INTO statements, the last of which contains some binary garbage:

systemctl stop manticore.service ; rm -rf /var/lib/manticore/* ; systemctl start manticore.service ; echo 'truncate table mantisRT;'|mysql -P9306 -h127.0.0.1 ;  cat suspect|mysql -P9306 -h127.0.0.1 ; indextool -c /etc/manticoresearch/manticore.conf --check mantisRT  

Expected behavior

The on-disk index file should be consistent, and indextool should succeed.

Describe the environment:

  • This behavior has been tested on cleanly set up Vultr VMs with Debian 10 [ Linux pqdmanticoredeb 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux ] and CentOS 8 [ Linux pqdmanticorecent 4.18.0-240.22.1.el8_3.x86_64 #1 SMP Thu Apr 8 19:01:30 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux ]
  • Manticore 3.6.0 96d61d8@210504 release for both Debian and CentOS

Messages from log files:
There is nothing interesting in searchd.log. Output of indextool:

Manticore 3.6.0 96d61d8bf@210504 release  
Copyright (c) 2001-2016, Andrew Aksyonoff  
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)  
Copyright (c) 2017-2021, Manticore Software LTD (https://manticoresearch.com)  
  
using config file '/etc/manticoresearch/manticore.conf'...  
checking index 'mantisRT'...  
checking schema...  
checking disk chunk, extension 0, 0(1)...  
checking schema...  
checking dictionary...  
FAILED, invalid docs/hits (pos=1, word=000002919f3e3881430bd39ccadd9d84, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=42, word=00001032f8cbf9091d8fade5bfa1700c, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=79, word=00001ae72e7589d62d679102a20ce02b, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=115, word=00002c07fdb8440ea6ba9e94cf56d28e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=152, word=000035dd17a7cb9dba97b95835ba7480, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=189, word=000043367415455275bf4c4fd624e512, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=226, word=000049d9ab3e0803656f8ddfd0303121, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=262, word=00004e3d8a06a5d4051a11a4cc25952f, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=298, word=000051d527065833627b9a0da7c1e110, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=335, word=00005300cf3083f94dc59bd4ca1dd5ab, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=371, word=000053260b8c1b507cae5e4d2b343a0c, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=406, word=0000545a3bdeb0ff3c50e1d9b3b66aa8, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=442, word=000058cc5e2fb21c55a806fb2eb19135, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=478, word=000059496a244b39d3ef0792e8144d3d, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=514, word=00005a556934fdae99151adbb5ff5faa, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=550, word=00005c5957e329ac23aa709cc9a3dc9f, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=586, word=000069cb6a850969e7decff5b606eef9, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=623, word=0000778c6e204b18eb58210ee0fd0a90, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=660, word=000077f941df712e3bf56436ed76a436, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=695, word=00007b0ed8466e4e120842f49c406cb4, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=731, word=00007c3cbfc86523882873c81bceab8a, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=768, word=000088781284bca827c4c1695fe424bb, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=806, word=00008adc74175d5a690bee73f3253ef1, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=843, word=00008b7e25d0cc44b16ca4d0902dc520, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=880, word=00008d44fbdbb7d7bb316cddc6d16b71, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=917, word=0000946c6bc9004bf64a9f9252330a2f, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=955, word=0000ac356a8eb3cc80cd5389b2dad580, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=993, word=0000b970942afcc5ff154de22720bd67, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1031, word=0000c016e2aff4cced5f4434e7a560dd, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1069, word=0000d14eb06cb08d681d93b95daea685, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1107, word=0000d1fc5686f5b87c1813c77d358884, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1143, word=0000d73f35bb6f136668ff8190ef9c41, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1180, word=0000d7d2513740464ef9a11734ba0969, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1216, word=0000e954798885c66c9c7e3c885f62e0, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1254, word=0000ebb6b8d9011328c537dc51f751fb, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1291, word=0000ef72069ec6fcaf8dd1d47c14ea57, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1328, word=0000f3ec71ce5a28f78545a2246ac94b, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1366, word=0000f7f8611f2380987ca578a26ad153, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1403, word=0000fae11c01f044112899c9b442ce11, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1440, word=0001054f1f10466d31e3f5f295dd5064, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1479, word=00010a44952858495aa59ec18c9733dc, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1516, word=00010ca1603856b9f4906bcd5b0ee9a6, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1553, word=00010ced12ce436dc283fda20adc8274, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1589, word=00010eb087e907dc333953d950dde6f4, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1626, word=000110516f04f3a93453d0732a81bd01, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1664, word=000110e8bfa662a2a4f87491c518f79c, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1700, word=0001213c92b1e827f34df5fda02cfe9c, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1738, word=0001220c93102dcdff48260024f54204, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1775, word=0001394c23dc13b83046778423567387, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1813, word=000141339ff48272c61cefeb7ed9767b, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1851, word=00014b5462a83e74c4525fe01f14f34d, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1888, word=00015b33ab66ef37d8009986e7fc31e1, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1926, word=00015b5ae1d50858c4b6e80d69638c37, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1962, word=00015bcd5cf3f8564ed776b2cbead216, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=1998, word=0001610c2c51ca224f00ddf8081895f7, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2036, word=000176bdd62c03355f11188d535ff8d7, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2074, word=00018811ad1f2ddf232016bd84e9f86e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2112, word=000188863459abf3ab2c327487eb81f9, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2148, word=00018a4899340fa84febc60139fcf262, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2185, word=00019c8abefe630ea3f5757a03fb946e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2223, word=0001a1d12300a8dccb8225c1b8e06b51, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2261, word=0001a69498f50a217f861c0b189cbd71, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2298, word=0001a6d38d2c0ebe91b4f35597e3a196, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2334, word=0001ac7811dceaa924e6ec8df5d1049e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2373, word=0001b31d5ae0b4d844f4a08331485dc8, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2415, word=0001c3ec3621fe37ed63e3fe995cf4c3, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2453, word=0001c779672430b8bd856afbe41370c2, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2490, word=0001e51f454402eda35e66563ec2c031, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2528, word=0001e5f94eaafcd1306c093b1ec70e5e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2564, word=0001ed56eaa45305c9dd05555f073d5c, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2601, word=0001f26086ff6fccb8e773d2e631b22e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2639, word=0002039544397ee706137d42c55946aa, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2678, word=0002053705f93cdb2c23b02209351f2f, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2715, word=00020bce4f592290298b74b42133c4a0, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2752, word=000211d499e24791a70e4a97896abda2, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2790, word=0002161589713cbadcbdd02f9fb9b39e, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2827, word=00021b31e33480a49c7b9427f2434873, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2864, word=00023067ae6e332a4032c2130d446d2a, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2902, word=0002362d6fc32248fe2c4e176ea3e5fc, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2939, word=0002395b57e37ac0df444a0984991060, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=2976, word=00023da6690e49eb303cc73ea98bc397, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3013, word=000241a6b71ecdbdb381ad1f0a37ee97, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3051, word=0002549618b90dfea770eff117a485fb, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3089, word=00025d6a2ff4a0b54007e922737a44cf, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3126, word=00026ca6e925d615f8cf13150a4a2a3f, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3164, word=000273be3e14f22ce5b66ca64adcb303, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3202, word=000278a503c74949399c12f18ed5e9a8, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3239, word=00027a823d38c4719b08df190450d23f, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3276, word=000287d264bb7cf8ab3886b2e4a76127, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3314, word=00028c289606fd4f1ba85ff27c5fed93, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3351, word=00028ce65279eadf597fba380f92ab9d, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3387, word=00029727171b0c6928338f4986c55cf4, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3425, word=00029953d94592719c28eba1651af2ee, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3462, word=00029a3a1e0a22505a52ed88680b9e2c, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3499, word=0002b215484019f3a102c1b5676b8da4, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3537, word=0002b3d65a100368b039340110a324f2, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3574, word=0002c4d69240a4c1d1f243003e50ed5b, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3612, word=0002cb58d01b5960d4e4a18497507e28, docs=-2147483647, hits=1)  
FAILED, invalid docs/hits (pos=3649, word=0002cc2b6791624402a9b3bde45dac71, docs=-2147483647, hits=1)  
checking data...  
checking rows...  
checking attribute blocks index...  
checking kill-list...  
checking dead row map...  
checking doc-id lookup...  
check FAILED, 99 of 4939314 failures reported, 5.9 sec elapsed  
check passed, 5.9 sec elapsed  
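
A note on the docs=-2147483647 value repeated in every FAILED line: read as a 32-bit two's-complement integer, it has the bit pattern 0x80000001, that is, the top bit set on top of a count of 1. The sketch below only does that arithmetic. Whether the top bit acts as a flag in the dictionary format is an assumption; nothing in this report establishes it.

```python
import struct


def as_uint32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned one."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


docs = -2147483647              # value printed by indextool in the log above
raw = as_uint32(docs)

top_bit_set = bool(raw & 0x80000000)   # is the highest bit set?
low_31_bits = raw & 0x7FFFFFFF         # remaining bits, read as a count

print(hex(raw))       # 0x80000001
print(top_bit_set)    # True
print(low_31_bits)    # 1
```

If the top bit were masked off before validation, the value would read as docs=1, which matches hits=1 on the same lines.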

Additional context

I came up with this artificial example while working on sanitizing data from #566.

It is the last line of the 'suspect' file that leads to the problem, but that line alone, without the previous ones, is not enough to reproduce it.

@sanikolaev
Collaborator

Thank you for the reproducible case, @pakud. I've reproduced it on our side.

sanikolaev added the bug label Jun 7, 2021
@githubmanticore
Contributor

➤ Aleksey N. Vinogradov commented:

Investigation steps performed:

  1. Shrink the dataset
  • The original dump cannot be shrunk significantly while keeping the error: the error disappears at a fairly large number of replaces, which is still too many for manual investigation.
  • However, the error appears to be in the disk chunk. Fewer replaces simply don't hit the memory limit, so with only a few inserts no disk chunk is created.
  • I appended a manual 'flush ramchunk' at the end of the replaces to force a disk chunk.
  • That made it possible to keep shrinking the dump while keeping the error observable.
  • By repeatedly halving the dump, I ended up with a single replace + flush ramchunk producing an erroneous disk chunk. Good!
  2. Shrink the statement
  • By shrinking that one statement I ended up with a case reproducible from a single token. Good!
  3. Shrink the token
  • By shortening the token, I ended up with a reproducible case where the token consists of 1 letter.

So the minimal reproducible case has been found:

REPLACE INTO mantisRT(id,content) VALUES(1,'b'); 
flush ramchunk mantisRT; 

Now it is time to fix it.
At first glance it may even be an error in indextool rather than in the index itself, but let's check first.
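
The halving step from the investigation above can be sketched as follows. This is a minimal illustration, not the script that was actually used: `shrink` and `indextool_reports_failure` are hypothetical names. The predicate assumes the mysql and indextool invocations from the original report, appends 'flush ramchunk' to force a disk chunk, and treats the string FAILED in indextool's output as the failure signal. All of that is an assumption based on the log shown in the report.

```python
import subprocess
from typing import Callable, List, Sequence

MYSQL = ["mysql", "-P9306", "-h127.0.0.1"]
INDEXTOOL = ["indextool", "-c", "/etc/manticoresearch/manticore.conf",
             "--check", "mantisRT"]


def indextool_reports_failure(statements: Sequence[str]) -> bool:
    """Hypothetical predicate: load statements, force a disk chunk, check it.

    Assumes a running searchd configured as in the report; untested sketch.
    """
    sql = "\n".join(["truncate table mantisRT;", *statements,
                     "flush ramchunk mantisRT;"])
    subprocess.run(MYSQL, input=sql, text=True, check=True)
    result = subprocess.run(INDEXTOOL, capture_output=True, text=True)
    return "FAILED" in result.stdout


def shrink(statements: Sequence[str],
           fails: Callable[[Sequence[str]], bool]) -> List[str]:
    """Repeatedly keep whichever half of the input still fails."""
    current = list(statements)
    if not fails(current):
        raise ValueError("the full input must reproduce the failure")
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first
        elif fails(second):
            current = second
        else:
            break  # the failure needs statements from both halves
    return current


# Stand-in demo that needs no server: pretend any statement inserting 'b' fails.
dump = [f"REPLACE INTO mantisRT(id,content) VALUES({i},'a');" for i in range(1, 8)]
dump.append("REPLACE INTO mantisRT(id,content) VALUES(8,'b');")
minimal = shrink(dump, lambda stmts: any("'b'" in s for s in stmts))
print(minimal)
```

Against a real server one would call `shrink(dump, indextool_reports_failure)` instead. The same loop applies to shrinking a single statement or token, with the input split into smaller pieces.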

@githubmanticore
Contributor

➤ Aleksey N. Vinogradov commented:

(issue was introduced in d38409e)

@sanikolaev
Collaborator

Closing as fixed.

@sanikolaev
Collaborator

Fixed in 47b9aeb
