Will it work on high dimensional (but sparse) encoded data? [Question] #14

Open
UnixJunkie opened this issue Oct 6, 2021 · 3 comments

Comments

@UnixJunkie

Hi Helena!
Let's say my data (X) are represented as sparse positive-integer vectors of dimension about 20,000.
And let's say I have a few thousand such data points.
Will ugtm work, or not?

I am asking before I start to invest manpower into using ugtm...

Thanks a lot!
F.

@UnixJunkie UnixJunkie changed the title from "Will it work on high dimensional (but sparse) encoded data?" to "Will it work on high dimensional (but sparse) encoded data? [Question]" on Oct 6, 2021
@hagax8
Owner

hagax8 commented Oct 6, 2021

Hi there! It would work, but in my experience the method is very sensitive to the curse of dimensionality, which is one of its main drawbacks: you will have lots of points mapped to the same coordinates. I'd use PCA preprocessing to get a 50- to 100-dimensional feature space (depending on the percentage of variance explained).
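
A minimal sketch of the PCA preprocessing suggested above, assuming plain NumPy and a dense array (a few thousand points by 20,000 dimensions would need to be densified first, or handled with a sparse-aware decomposition). The helper name `pca_reduce`, its parameters, and the synthetic data are illustrative assumptions, not part of ugtm. It keeps the smallest number of components that reaches a target fraction of explained variance, capped at 100.

```python
import numpy as np

def pca_reduce(X, max_components=100, min_variance=0.90):
    """Project X onto its leading principal components (hypothetical helper).

    Keeps the smallest number of components explaining at least
    `min_variance` of the total variance, capped at `max_components`.
    Returns the reduced data and the fraction of variance actually kept.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # SVD of the centered data: the rows of Vt are the principal axes.
    _, S, Vt = np.linalg.svd(centered, full_matrices=False)
    cumulative = np.cumsum(S**2) / np.sum(S**2)
    # First index whose cumulative variance reaches the target, as a count.
    k = int(np.searchsorted(cumulative, min_variance)) + 1
    k = min(k, max_components, len(S))
    return centered @ Vt[:k].T, float(cumulative[k - 1])

# Synthetic stand-in for the data in the question, shrunk so it runs quickly:
# sparse, non-negative integer counts.
rng = np.random.default_rng(0)
X = rng.poisson(0.05, size=(300, 2000))

X_reduced, variance_kept = pca_reduce(X, max_components=100, min_variance=0.90)
print(X_reduced.shape, round(variance_kept, 3))
# X_reduced, not X, would then be passed to the GTM.
```

If the variance kept at 100 components is still low, that is a hint that the data do not compress well linearly, and the choice of 50 to 100 dimensions may need revisiting.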

@UnixJunkie
Author

Could the package be parameterized by a distance function?
I.e., even if the points are high-dimensional, I would provide a distance function that behaves well on them,
and the calculation would use that distance function rather than processing the X coordinates directly.

@lyiuan

lyiuan commented Aug 8, 2024

Hello, this might be a silly question, but I hope to get your reply.
What is the maximum dimension that GTM can typically handle?
Additionally, after using PCA for dimensionality reduction, is the dimensionality of the GTM inverse mapping also reduced? If I want the inverse mapping to return to the original dimensions, can I still use PCA for dimensionality reduction?
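
On the inverse-mapping part of this question, a hedged sketch rather than an answer from the maintainer: a GTM trained on PCA-reduced data has its inverse mapping in that reduced space, but the PCA projection itself can be inverted, so the two steps can be chained. The example below uses plain NumPy and shows only the PCA round trip. `Z_from_gtm` is a hypothetical stand-in for whatever the GTM inverse mapping returns, and `to_reduced` and `to_original` are illustrative names. The reconstruction is exact only for the variance the kept components capture.

```python
import numpy as np

# Synthetic data that really lives in 50 dimensions, embedded in 1000.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50)) @ rng.normal(size=(50, 1000))

# Fit PCA once and keep the mean and axes so the projection can be undone.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:50]  # principal axes, shape (50, 1000)

def to_reduced(X_original):
    """Original space -> PCA space (what the GTM would be trained on)."""
    return (X_original - mean) @ components.T

def to_original(X_reduced):
    """PCA space -> original space (approximate inverse of to_reduced)."""
    return X_reduced @ components + mean

Z = to_reduced(X)  # shape (200, 50)

# Hypothetical: points returned by the GTM inverse mapping, in PCA space.
Z_from_gtm = Z[:5]
X_back = to_original(Z_from_gtm)

print(X_back.shape)
print(np.max(np.abs(X_back - X[:5])))  # reconstruction error
```

With fewer components than the true rank of the data, `to_original` still returns vectors of the original dimension, but they are a smoothed approximation and lose whatever variance the discarded components carried.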
