-
Please provide the code you are using to scale up the data. A sample of the data would also be helpful.
-
I just tried this with my own private dataset: around 500_000 short texts, with around 50_000 unique words. It works without problems on the latest version installed via conda. So indeed, a reproducible example would be nice (essential, actually).
-
Hi,
I just started using Vaex yesterday, so maybe the answer is obvious, but I didn't find anything yet.
I have a column of strings containing several words (~3M rows, up to ~150 words per string), and I want to count words across the entire frame.
Testing at a small scale, everything works fine:
```python
import vaex

text = ['Something', 'very pretty', 'is coming', 'our', 'way.']
df = vaex.from_arrays(text=text)
test = df.text.str.split(' ')
df['test'] = test
df.test.value_counts()
```

which gives:

```
way.         1
is           1
pretty       1
our          1
Something    1
coming       1
very         1
dtype: int64
```
But when I try to scale up to the actual dataset, I get this error: `AttributeError: 'pyarrow.lib.ChunkedArray' object has no attribute 'values'`,
and I don't understand how to work with this 😢
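As a stopgap while the chunked-array issue is unresolved, one workaround (a minimal sketch, not Vaex-specific — `texts` here is a placeholder for the real column's contents) is to count the words outside Vaex with a plain `collections.Counter`:

```python
# Hedged workaround sketch: count words with a plain Counter instead of
# relying on df.test.value_counts(), which fails on a pyarrow ChunkedArray.
# `texts` stands in for the actual string column; for a real Vaex frame you
# could materialize the column (e.g. via .tolist() or iterating in chunks).
from collections import Counter

texts = ['Something', 'very pretty', 'is coming', 'our', 'way.']

counts = Counter()
for t in texts:
    counts.update(t.split(' '))  # split each row into words and tally them

print(counts.most_common(3))
```

For large frames, processing the column in chunks and merging the counters keeps memory bounded. Alternatively, pyarrow's `ChunkedArray.combine_chunks()` may flatten the chunked column into a single array, which could sidestep the missing-`.values` error, though whether that helps here depends on where Vaex hits the attribute.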