ValueError: setting an array element with a sequence. Using pysparnn.matrix_distance.SlowEuclideanDistance #27

liuchenbaidu · 2019-08-14T03:37:33Z

import pysparnn.cluster_index as ci

from sklearn.feature_extraction.text import TfidfVectorizer
import pysparnn
data = [
'hello world',
'oh hello there',
'Play it',
'Play it again Sam',
]
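# Chinese test sentences; this assignment replaces the English list above
data = [
    '你在干什么',    # "What are you doing?"
    '你在干啥子',    # "What are you doing?" (dialect)
    '你在做什么',    # "What are you doing?"
    '你好啊',        # "Hello!"
    '我喜欢吃香蕉',  # "I like eating bananas"
]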

tv = TfidfVectorizer()
tv.fit(data)

features_vec = tv.transform(data)
print(type(features_vec),features_vec.shape)

# build the search index!

cp = ci.MultiClusterIndex(features_vec, data,pysparnn.matrix_distance.SlowEuclideanDistance)

# search the index with a sparse matrix

search_data = [
'oh there',
'Play it again Frank'
]

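# Chinese queries; this assignment replaces the English queries above
search_data = [
    '你在干啥',      # "What are you doing?"
    '我喜欢吃香蕉',  # "I like eating bananas"
]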
search_features_vec = tv.transform(search_data)

res=cp.search(search_features_vec, k=3, k_clusters=3, return_distance=False)

print(res)

kchaliki · 2020-02-04T20:45:46Z

@liuchenbaidu Indeed, that code doesn't work with sparse matrices. The test uses dense matrices, which is why this went unnoticed. I did implement this separately elsewhere using scikit-learn's Euclidean distance, but it is so much slower than cosine distance that it raises the question of whether you need it.
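For anyone who still needs Euclidean distance on sparse TF-IDF vectors, here is a minimal sketch of a brute-force search built on scikit-learn's `euclidean_distances`, which accepts SciPy sparse matrices. It is not the implementation mentioned above and it does not use pysparnn's cluster index at all. The function name `brute_force_euclidean_search` is hypothetical, and the example reuses the English sentences from the original report.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import euclidean_distances


def brute_force_euclidean_search(index_vecs, records, query_vecs, k=3):
    """Return the k records closest to each query row by Euclidean distance.

    Hypothetical helper for illustration; not part of pysparnn.
    """
    # euclidean_distances accepts sparse input and returns a dense
    # array of shape (n_queries, n_records).
    dists = euclidean_distances(query_vecs, index_vecs)
    k = min(k, len(records))
    # Sort each row by distance and keep the k nearest record indices.
    nearest = np.argsort(dists, axis=1)[:, :k]
    return [[records[j] for j in row] for row in nearest]


docs = ['hello world', 'oh hello there', 'Play it', 'Play it again Sam']
tv = TfidfVectorizer()
doc_vecs = tv.fit_transform(docs)  # sparse CSR matrix
query_vecs = tv.transform(['oh there', 'Play it again Frank'])

results = brute_force_euclidean_search(doc_vecs, docs, query_vecs, k=2)
print(results)
```

This compares every query against every indexed record, so it only suits small collections. Also note that `TfidfVectorizer` L2-normalizes rows by default, in which case Euclidean and cosine distance produce the same ranking, which is one more reason the faster cosine index may be enough.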
