Highlight very slow performance (x28) on remote index and returning empty values when fieldname without quotes specified in HIGHLIGHT({around=5},msg)
#1158 · Open · popalot2 opened this issue on Jun 7, 2023 · 2 comments
Describe the bug
When using HIGHLIGHT on a remote index (a distributed index that wraps a local index through an agent), performance is about 28 times worse than on the same local index.
When using HIGHLIGHT({around=5},msg) (field name without quotes) on the remote index, an empty string is returned and no error is reported.
To Reproduce
create index, adjust path to correct location
source src_highlight_performance_hit
{
type = csvpipe
csvpipe_command= awk 'BEGIN {srand(); for (i = 1; i <= 10000; i++) {printf int(rand() * 10000000) ",2,3,aaa,Led Zeppelin "; system("head -n 2000 /usr/share/dict/words | shuf| head -n 1000|tr -c \"[:alnum:]\" \" \""); print ""}}'
csvpipe_attr_uint=f1
csvpipe_attr_uint=f2
csvpipe_field=s1
csvpipe_field = msg
}
index idx_highligh_performance_hit
{
stored_fields = msg
source = src_highlight_performance_hit
path = /var/manticore/idx_highligh_performance_hit
index_exact_words=1
min_prefix_len = 0
docstore_compression = none
ngram_len = 1
html_strip = 1
}
index idx_highligh_performance_hit_local
{
local = idx_highligh_performance_hit
type = distributed
}
index idx_highligh_performance_hit_remote
{
agent = 127.0.0.1:9312:idx_highligh_performance_hit
type = distributed
}
prepare index
indexer idx_highligh_performance_hit --rotate
run SQL, observe that highlight is empty when using HIGHLIGHT({around=5},msg) on remote index
#works ok
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},msg) h, RAND() r FROM idx_highligh_performance_hit WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
#highlight empty from remote index, no error
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},msg) h, RAND() r FROM idx_highligh_performance_hit_remote WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
#works ok
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},'msg') h, RAND() r FROM idx_highligh_performance_hit WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},'msg') h, RAND() r FROM idx_highligh_performance_hit WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
results:
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},msg) h, RAND() r FROM idx_highligh_performance_hit WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
567002 <b>Led</b> Zeppelin absconce abyssopelagic absolutista absurdness ... 0.99935228
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},msg) h, RAND() r FROM idx_highligh_performance_hit_remote WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
698081 0.99990314
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},'msg') h, RAND() r FROM idx_highligh_performance_hit WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
467910 <b>Led</b> Zeppelin abasement abaised acatharsy ablative ... 0.99792570
mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT id ,HIGHLIGHT({around=5},'msg') h, RAND() r FROM idx_highligh_performance_hit WHERE match('Led') ORDER BY r DESC LIMIT 0,1"
57870 <b>Led</b> Zeppelin abluted abrogable acana abolisher ... 0.99980712
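Because the empty highlight comes back without any error, it is easy to miss in scripted runs. A minimal sketch of a filter that flags it, assuming the tab-separated column layout that `mysql -ss` produces for these queries (id, h, r); `check_highlight` is a hypothetical helper name, not part of Manticore or the mysql client:

```shell
# Classify each line of `mysql -ss` output (tab-separated: id, h, r).
# Prints "EMPTY <id>" when the highlight column is blank, "OK <id>" otherwise.
check_highlight() {
  awk -F '\t' '{
    status = ($2 == "") ? "EMPTY" : "OK"
    print status, $1
  }'
}

# Intended usage against a running instance (same connection flags as above):
#   mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss -e "SELECT ..." | check_highlight
```

Piping each of the four queries above through the filter shows at a glance which index and field-name form yields an empty highlight.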
prepare test SQL
awk 'BEGIN {for (i = 1; i <= 300; i++) { print "SELECT id ,HIGHLIGHT(), RAND() r FROM idx_highligh_performance_hit_local WHERE match(\047Led\047) ORDER BY r DESC LIMIT 0,30 OPTION threads=1; ";} }' > queries_local.txt
awk 'BEGIN {for (i = 1; i <= 300; i++) { print "SELECT id ,HIGHLIGHT(), RAND() r FROM idx_highligh_performance_hit_remote WHERE match(\047Led\047) ORDER BY r DESC LIMIT 0,30 OPTION threads=1;";} }' > queries_remote.txt
awk 'BEGIN {for (i = 1; i <= 300; i++) { print "SELECT id ,RAND() r FROM idx_highligh_performance_hit_local WHERE match(\047Led\047) ORDER BY r DESC LIMIT 0,30 OPTION threads=1; ";} }' > queries_local_no_highlight.txt
awk 'BEGIN {for (i = 1; i <= 300; i++) { print "SELECT id ,RAND() r FROM idx_highligh_performance_hit_remote WHERE match(\047Led\047) ORDER BY r DESC LIMIT 0,30 OPTION threads=1;";} }' > queries_remote_no_highlight.txt
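The four generators differ only in the select list and the index name, so they could be collapsed into one helper. This is a sketch, assuming the intended query text is `match('Led')` as in the manual queries earlier; `gen_queries` is a hypothetical helper name, and the output file names are the ones used above:

```shell
# Write 300 copies of one benchmark query to a file.
# $1 = select list, $2 = index name, $3 = output file
gen_queries() {
  for i in $(seq 1 300); do
    printf "SELECT %s FROM %s WHERE match('Led') ORDER BY r DESC LIMIT 0,30 OPTION threads=1;\n" "$1" "$2"
  done > "$3"
}

gen_queries "id, HIGHLIGHT(), RAND() r" idx_highligh_performance_hit_local  queries_local.txt
gen_queries "id, HIGHLIGHT(), RAND() r" idx_highligh_performance_hit_remote queries_remote.txt
gen_queries "id, RAND() r"              idx_highligh_performance_hit_local  queries_local_no_highlight.txt
gen_queries "id, RAND() r"              idx_highligh_performance_hit_remote queries_remote_no_highlight.txt
```

Keeping the query text in one place makes it harder for the local and remote variants to drift apart between runs.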
run test SQL to see performance
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_local.txt > res_queries_local.txt
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_remote.txt > res_queries_remote.txt
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_local_no_highlight.txt > res_queries_local_no_highlight.txt
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_remote_no_highlight.txt > res_queries_remote_no_highlight.txt
observe that HIGHLIGHT on the remote index performs about 28 times worse than on the same local index.
results
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_local.txt > res_queries_local.txt
real 0m1.876s
user 0m0.011s
sys 0m0.006s
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_remote.txt > res_queries_remote.txt
real 0m52.648s
user 0m0.011s
sys 0m0.008s
When HIGHLIGHT is not used, performance is only slightly worse on the remote index (as expected)
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_local_no_highlight.txt > res_queries_local_no_highlight.txt
real 0m0.295s
user 0m0.005s
sys 0m0.005s
time mysql --protocol=tcp -h localhost -P 9306 -u aaa -ss < queries_remote_no_highlight.txt > res_queries_remote_no_highlight.txt
real 0m0.442s
user 0m0.006s
sys 0m0.004s
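For reference, the slowdown factors can be recomputed from the `real` times reported above. A small sketch; the variable names and the `ratio` helper are illustrative only, and the numbers are copied from the four timing runs (300 queries each):

```shell
# Wall-clock times in seconds, copied from the `time` output above.
local_hl=1.876
remote_hl=52.648
local_plain=0.295
remote_plain=0.442
queries=300

# Ratio of two times, rounded to one decimal place.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", a / b }'; }

hl_ratio=$(ratio "$remote_hl" "$local_hl")
plain_ratio=$(ratio "$remote_plain" "$local_plain")
# Average per-query time in milliseconds on the remote index with HIGHLIGHT().
remote_ms=$(awk -v t="$remote_hl" -v n="$queries" 'BEGIN { printf "%.1f", t * 1000 / n }')

echo "with HIGHLIGHT():     remote/local = ${hl_ratio}x"
echo "without HIGHLIGHT():  remote/local = ${plain_ratio}x"
echo "remote + HIGHLIGHT(): ${remote_ms} ms per query on average"
```

This gives roughly 28x with HIGHLIGHT() against roughly 1.5x without it, so the agent round trip alone does not account for the gap.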
Expected behavior
HIGHLIGHT({around=5},msg) should return correct data.
HIGHLIGHT on a remote index should perform much better than it does now.
Describe the environment:
Manticore 6.0.5 844b1ae@230606 dev (columnar 2.0.5 d593e0d@230529) (secondary 2.0.5 d593e0d@230529)
Linux Alma-87-amd64-base 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 9 05:49:00 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
Messages from log files:
Not applicable
Additional context