MDS RST & SC example code #121
Merged
22 commits
b819a6b
MLPClassifier SC example code
tedmoore abbbe91
Merge branch 'dev' into MLPs
tedmoore e8bc272
MLPClassifier SC example code ⚠️
tedmoore d814341
MLPs SC example code and RSTs ⚠️
tedmoore c29a3f5
Merge branch 'dev' into MLPs
tedmoore 0e9157a
⚠️
tedmoore fd536d8
⚠️
tedmoore e023fb1
wip
tedmoore 8a631af
MLP RSTs
tedmoore 7d9636d
[FIX] this commit adds code to the SC templates to allow enums in the…
tedmoore 8ace05b
feedback
tedmoore 0b4c755
hidden -> hiddenLayers
tedmoore 910654b
⚠️
tedmoore 0dbc121
Merge branch 'dev' into MDS-RST-SC-examples
tedmoore c9f2cd0
Merge branch 'MLPs' into MDS-RST-SC-examples
tedmoore 2b98daa
sc example ⚠️
tedmoore 2254a09
sc example code
tedmoore 8f5c985
bump
tedmoore 54cb14e
typo
tedmoore 8808c9e
removed cosine
tedmoore a163f0d
weefuzzy and james feedback
tedmoore 09fccba
fixed plot size
tedmoore
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,30 +1,73 @@ | ||
| :digest: Dimensionality Reduction with Multidimensional Scaling | ||
| :digest: Multidimensional Scaling | ||
| :species: data | ||
| :sc-categories: Dimensionality Reduction, Data Processing | ||
| :sc-related: Classes/FluidMDS, Classes/FluidDataSet | ||
| :see-also: | ||
| :description: | ||
| Multidimensional scaling of a :fluid-obj:`DataSet` | ||
|
|
||
| https://scikit-learn.org/stable/modules/manifold.html#multi-dimensional-scaling-mds | ||
| Dimensionality Reduction of a :fluid-obj:`DataSet` Using Multidimensional Scaling | ||
|
|
||
| :discussion: | ||
|
|
||
| Multidimensional Scaling transforms a dataset to a lower number of dimensions while trying to preserve the distance relationships between the data points, so that even with fewer dimensions, the differences and similarities between points can still be observed and used effectively. | ||
|
|
||
| First, MDS computes a distance matrix by calculating the distance between every pair of points in the dataset. It then positions all the points in the lower number of dimensions (specified by ``numDimensions``) and iteratively shifts them around until the distances between all the points in the lower number of dimensions are as close as possible to the distances in the original dimensional space. | ||
|
|
||
| What makes this MDS implementation more flexible than some of the other dimensionality reduction algorithms in FluCoMa is that MDS allows for different measures of distance to be used (see list below). | ||
|
|
||
| Note that unlike the other dimensionality reduction algorithms, MDS does not have a ``fit`` or ``transform`` method, nor does it have the ability to transform data points in buffers. This is essentially because the algorithm needs to do the fit & transform as one with just the data provided in the source DataSet and therefore incorporating new data points would require a re-fitting of the model. | ||
|
|
||
| **Manhattan Distance:** The sum of the absolute value difference between points in each dimension. This is also called the Taxicab Metric. https://en.wikipedia.org/wiki/Taxicab_geometry | ||
|
|
||
| **Euclidean Distance:** The square root of the sum of the squared differences between points in each dimension (Pythagorean Theorem). https://en.wikipedia.org/wiki/Euclidean_distance This metric is the default, as it is the most commonly used. | ||
|
||
|
|
||
| **Squared Euclidean Distance:** The square of the Euclidean Distance between points. This distance measure more strongly penalises larger distances, making them seem more distant, which may reveal more clustered points. https://en.wikipedia.org/wiki/Euclidean_distance#Squared_Euclidean_distance | ||
|
|
||
| **Minkowski Max Distance:** The distance between two points is reported as the largest difference between those two points in any one dimension. Also called the Chebyshev Distance or the Chessboard Distance. https://en.wikipedia.org/wiki/Chebyshev_distance | ||
|
|
||
| **Minkowski Min Distance:** The distance between two points is reported as the smallest difference between those two points in any one dimension. | ||
|
|
||
| **Symmetric Kullback Leibler Divergence:** Because the first part of this computation uses the logarithm of the values, using the Symmetric Kullback Leibler Divergence only makes sense with non-negative data. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Symmetrised_divergence | ||
|
|
||
| .. **Cosine Distance:** Cosine Distance considers each data point a vector in Cartesian space and computes the angle between the two points. It first normalizes these vectors so they both sit on the unit circle and then finds the dot product of the two vectors which returns a calculation of the angle. Therefore this measure does not consider the magnitudes of the vectors when computing distance. https://en.wikipedia.org/wiki/Cosine_similarity (This article describes the cosine _similarity_, as opposed to distance, however since the cosine similarity is always between -1 and 1, the distance is computed as 1 - cosine similarity, which will always range from a minimum distance of 0 to a maximum distance of 2.) | ||
|
|
||
| :control numDimensions: | ||
|
|
||
| The number of dimensions to reduce to | ||
|
|
||
| :control distanceMetric: | ||
|
|
||
| The distance metric to use (integer, 0-6, see flags above) | ||
| The distance metric to use (integer 0-5) | ||
|
|
||
| :enum: | ||
|
|
||
| :0: | ||
| Manhattan Distance | ||
|
|
||
| :1: | ||
| Euclidean Distance (default) | ||
|
|
||
| :2: | ||
| Squared Euclidean Distance | ||
|
|
||
| :3: | ||
| Minkowski Max Distance | ||
|
|
||
| :4: | ||
| Minkowski Min Distance | ||
|
|
||
| :5: | ||
| Symmetric Kullback Leibler Divergence | ||
|
|
||
| .. :6: | ||
| .. Cosine Distance | ||
|
|
||
| :message fitTransform: | ||
|
|
||
| :arg sourceDataSet: Source data, or the DataSet name | ||
| :arg sourceDataSet: Source DataSet | ||
|
|
||
| :arg destDataSet: Destination data, or the DataSet name | ||
| :arg destDataSet: Destination DataSet | ||
|
|
||
| :arg action: Run when done | ||
|
|
||
| Fit the model to a :fluid-obj:`DataSet` and write the new projected data to a destination FluidDataSet. | ||
| Fit the model to a :fluid-obj:`DataSet` and write the new projected data to a destination DataSet. | ||
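
For readers who want to see the listed distance measures concretely, here is a minimal Python sketch of the six metrics in the order of the ``distanceMetric`` enum (0-5), plus the pairwise distance matrix that MDS starts from. It is not FluCoMa's implementation: the function names, the ``METRICS`` list, and ``distance_matrix`` are hypothetical, and the Symmetric Kullback Leibler function uses the textbook symmetrised form, which may differ from FluCoMa's in scaling or in how zeros are handled.

```python
import math

# Illustrative sketch only: these names are hypothetical, not FluCoMa API.

def manhattan(a, b):
    # Sum of absolute differences in each dimension.
    return sum(abs(x - y) for x, y in zip(a, b))

def squared_euclidean(a, b):
    # Sum of squared differences in each dimension.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    # Square root of the squared Euclidean distance.
    return math.sqrt(squared_euclidean(a, b))

def minkowski_max(a, b):
    # Largest difference in any one dimension (Chebyshev distance).
    return max(abs(x - y) for x, y in zip(a, b))

def minkowski_min(a, b):
    # Smallest difference in any one dimension.
    return min(abs(x - y) for x, y in zip(a, b))

def symmetric_kl(a, b):
    # Textbook symmetrised KL divergence: sum of (a - b) * (log a - log b).
    # Assumes strictly positive values, because of the logarithm.
    return sum((x - y) * (math.log(x) - math.log(y)) for x, y in zip(a, b))

# Index order mirrors the enum in the reference text (0-5).
METRICS = [manhattan, euclidean, squared_euclidean,
           minkowski_max, minkowski_min, symmetric_kl]

def distance_matrix(points, metric_index=1):
    # Distance between every pair of points, the first step described for MDS.
    metric = METRICS[metric_index]
    return [[metric(p, q) for q in points] for p in points]

if __name__ == "__main__":
    pts = [[1.0, 2.0, 3.0], [4.0, 6.0, 3.0], [2.0, 2.0, 2.0]]
    for row in distance_matrix(pts, metric_index=1):
        print(row)
```

Calling ``distance_matrix(pts, metric_index=0)`` on the same points gives the Manhattan version of the matrix, which shows how the choice of metric changes the distances that the low-dimensional layout then tries to preserve.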