vecsim index performance: hnswlib::L2SqrSIMD4Ext takes 39.35% of on-cpu cycles ( _mm_add_ps and _mm_loadu_ps are related to it ) #28

Open
filipecosta90 opened this issue Jul 28, 2021 · 0 comments

Sample RDB loadable via the vecsim search branch:

s3://benchmarks.redislabs/redisearch/vecsim/ann-benchmarks/glove-100-angular/dump.rdb

Sample query:

627473461.931705 [0 127.0.0.1:41480] "HSET" "ann_14541" "vector" "\x98Q\xec\xbd\x84\x81\xf7>\xb4<o\xbe*\xe3\xbf>m\xc5\xde\xbe\xf7u\b\xbfK\x02\x14>V\x82%\xbe\x02\x829>W\xb1x<y\x1e$?\xd69\xd6>\xa8Wz?;\xdf\x87\xbf\xa3\x92\xca?=\xb8\xdb\xbe\xcc\xb45?\xdc\x7f\xa4=\x0eJ(?\x97sy?W\xcf\x01?+\xa4\x9c\xbeKY\xa6\xbe\xdb3\xbb>\xee_\xb9>\x9c\xdc\xdf>\xb0\xac4\xbd\x92\a\x82\xbd\xd6s\xb2\xbeU\xde\x1e?E\rN?e\x01\x93<\x1f\xbf7>\xa1\xd6\x14\xbf\x97\xa8\xde\xbe\x1d8\xf7\xbeX9\x84>\xa0\x1a\x97?\x12\xa2\xfc\xbc\xd9|\xac>\x9c\xa7\xba\xbe\xbb\xf2Y>=~\xcf\xbe\x93\xa9\x02>\xa0\xe0\x82\xbe\xffx\xef=u\xc85\xbf\x05\xa2\xa7\xbd\xb1\x16_\xbeJ\x98\t?\xe6t\x19\xbe\xef\xac\xdd=\xf4\xa6:?\xf6E\x02\xbe\x9f\xc8\x0b?ms\x83\xbe\xc8\xef\x05\xbfi5\x04\xbe\xe2X\a?O\xcc\x9a>*o\x17?_{.?8g\x94?\xee\xb1\x84\xbe\xdf\xa6W\xbf>\"\xe6\xbdk\x9a7\xbd\xfb\\M>\x1f\x11\xd3\xbe\\ y?\xcb\xdb1>\"\xc3\x12?+\x18\x95\xbe\xf2\xb5\x1f\xbf\xc4%\xb7\xbe\xb5\x1a\x1a?WC\xa2>\xb9\x88o>$\xd6J?\x89_1\xbc_{F>\x89\xb4\x8d\xbdI.\x8f\xbf\xa2b\x9c>\xaa\xb7\xd6\xbe\xbb\x0f ?z\xc2r>\xa1\x10\xc1\xbe\xda \x03\xbf\xbe\xf6\x8c>\xa1\x83\xae\xbc\xbaI\xb4?\x96&\x85\xbd\xc4\xeb\xca\xbe\xee\xce\x1a?\xf47\x81\xbe\x89\b\xbf\xbd\x99\r\xe2>;p\x8e\xbe/\xf8t="

Top on-CPU consumers:

| Flat | Flat% | Sum% | Cum | Cum% | Name | Inlined? |
|---|---|---|---|---|---|---|
| 38718330030 | 17.65% | 17.65% | 38766179198 | 17.67% | _mm_loadu_ps | (inline) |
| 18647707936 | 8.50% | 26.15% | 18666112242 | 8.51% | _mm_add_ps | (inline) |
| 15234704835 | 6.94% | 33.09% | 86321125827 | 39.35% | hnswlib::L2SqrSIMD4Ext | |
| 10545240828 | 4.81% | 37.90% | 10556520348 | 4.81% | _mm_mul_ps | (inline) |
| 3086731940 | 1.41% | 39.31% | 3086731940 | 1.41% | _mm_sub_ps | (inline) |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | vectorIndexer | (inline) |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | moduleNotifyKeyspaceEvent | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | indexBulkFields | |
| 0 | 0.00% | 39.31% | 68849073985 | 31.38% | hnswlib::HierarchicalNSW::searchBaseLayer | |
| 0 | 0.00% | 39.31% | 12029285653 | 5.48% | hnswlib::HierarchicalNSW::mutuallyConnectNewElement | |
| 0 | 0.00% | 39.31% | 86258259186 | 39.32% | hnswlib::HierarchicalNSW::addPoint | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | Indexes_UpdateMatchingWithSchemaRules | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | Indexer_Process | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | Indexer_Add | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | IndexerBulkAdd | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | IndexSpec_UpdateDoc | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | HashNotificationCallback | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | HNSWIndex_AddVector | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | Document_AddToIndexes | |
| 0 | 0.00% | 39.31% | 86317305756 | 39.34% | AddDocumentCtx_Submit | |

Flame chart detail of hnswlib::L2SqrSIMD4Ext CPU cycles:

Link:
https://s3.amazonaws.com/benchmarks.redislabs/redisearch/vecsim/perf-tasks/ann-benchmarks/glove-100-angular/ann-benchmark-indexing.svg
