57 changes: 50 additions & 7 deletions doc/MDS.rst
:digest: Multidimensional Scaling
:species: data
:sc-categories: Dimensionality Reduction, Data Processing
:sc-related: Classes/FluidMDS, Classes/FluidDataSet
:see-also:
:description:
Dimensionality Reduction of a :fluid-obj:`DataSet` Using Multidimensional Scaling

:discussion:

Multidimensional Scaling transforms a dataset to a lower number of dimensions while trying to preserve the distance relationships between the data points, so that even with fewer dimensions, the differences and similarities between points can still be observed and used effectively.

First, MDS computes a distance matrix by calculating the distance between every pair of points in the dataset. It then positions all the points in the lower number of dimensions (specified by ``numDimensions``) and iteratively shifts them around until the distances between all the points in the lower-dimensional space are as close as possible to the distances in the original dimensional space.
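
"As close as possible" is typically quantified as a *stress* value that the algorithm minimises. The exact objective used by this implementation is not specified here, but a standard metric MDS stress, with :math:`d_{ij}` the distances in the original space and :math:`x_1 \ldots x_N` the positions in the reduced space, has the form:

.. math::

   S(x_1, \ldots, x_N) = \sqrt{\sum_{i < j} \left( d_{ij} - \lVert x_i - x_j \rVert \right)^2}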

What makes this MDS implementation more flexible than some of the other dimensionality reduction algorithms in FluCoMa is that it allows different measures of distance to be used (see the list below, and the sketch that follows it).

Note that unlike the other dimensionality reduction algorithms, MDS does not have a ``fit`` or ``transform`` method, nor does it have the ability to transform data points in buffers. This is essentially because the algorithm needs to do the fit and transform as one, using just the data provided in the source DataSet; incorporating new data points would therefore require re-fitting the model.

**Manhattan Distance:** The sum of the absolute value difference between points in each dimension. This is also called the Taxicab Metric. https://en.wikipedia.org/wiki/Taxicab_geometry

**Euclidean Distance:** The square root of the sum of the squared differences between points in each dimension (the Pythagorean Theorem). This metric is the default, as it is the most commonly used. https://en.wikipedia.org/wiki/Euclidean_distance

**Squared Euclidean Distance:** The square of the Euclidean Distance between points. This measure penalises larger distances more strongly, making far-apart points seem even more distant, which may make clusters of nearby points more apparent. https://en.wikipedia.org/wiki/Euclidean_distance#Squared_Euclidean_distance

**Minkowski Max Distance:** The distance between two points is reported as the largest difference between those two points in any one dimension. Also called the Chebyshev Distance or the Chessboard Distance. https://en.wikipedia.org/wiki/Chebyshev_distance

**Minkowski Min Distance:** The distance between two points is reported as the smallest difference between those two points in any one dimension.

**Symmetric Kullback Leibler Divergence:** Because the first part of this computation uses the logarithm of the values, using the Symmetric Kullback Leibler Divergence only makes sense with non-negative data. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Symmetrised_divergence

.. **Cosine Distance:** Cosine Distance considers each data point a vector in Cartesian space and computes the angle between the two vectors. It first normalizes these vectors so they both sit on the unit circle and then finds their dot product, which gives the cosine of the angle. Therefore this measure does not consider the magnitudes of the vectors when computing distance. https://en.wikipedia.org/wiki/Cosine_similarity (This article describes the cosine *similarity*, as opposed to distance, however since the cosine similarity is always between -1 and 1, the distance is computed as 1 - cosine similarity, which will always range from a minimum distance of 0 to a maximum distance of 2.)
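
For reference, here is a minimal sketch of how the first five of these distances could be computed between two example points in plain SuperCollider (independent of FluidMDS itself)::

  (
  var a = [ 0.1, 0.5, 0.9 ];
  var b = [ 0.4, 0.2, 0.6 ];
  var diff = a - b;              // element-wise difference
  [
      diff.abs.sum,              // 0: Manhattan
      diff.squared.sum.sqrt,     // 1: Euclidean (default)
      diff.squared.sum,          // 2: Squared Euclidean
      diff.abs.maxItem,          // 3: Minkowski Max (Chebyshev)
      diff.abs.minItem           // 4: Minkowski Min
  ].postln;
  )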

:control numDimensions:

The number of dimensions to reduce to

:control distanceMetric:

The distance metric to use (integer 0-5)

:enum:

:0:
Manhattan Distance

:1:
Euclidean Distance (default)

:2:
Squared Euclidean Distance

:3:
Minkowski Max Distance

:4:
Minkowski Min Distance

:5:
Symmetric Kullback Leibler Divergence

.. :6:
.. Cosine Distance

:message fitTransform:

:arg sourceDataSet: Source DataSet

:arg destDataSet: Destination DataSet

:arg action: Run when done

Fit the model to a :fluid-obj:`DataSet` and write the new projected data to a destination DataSet.
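
A minimal usage sketch in SuperCollider (it assumes a booted server ``s`` and a :fluid-obj:`DataSet` ``~src`` already filled with higher-dimensional points; the constructor arguments shown are illustrative)::

  ~mds = FluidMDS(s, numDimensions: 2, distanceMetric: 1);
  ~dst = FluidDataSet(s);
  ~mds.fitTransform(~src, ~dst, action: { "MDS done".postln; ~dst.print; });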
58 changes: 37 additions & 21 deletions doc/MLPClassifier.rst
:digest: Classification with a multi-layer perceptron
:species: data
:sc-categories: Machine learning
:sc-related: Classes/FluidMLPRegressor, Classes/FluidDataSet, Classes/FluidLabelSet
:see-also:
:description:

Perform classification between a :fluid-obj:`DataSet` and a :fluid-obj:`LabelSet` using a Multi-Layer Perceptron neural network.

:discussion:

For a thorough explanation of how this object works and more information on the parameters, visit the pages on **MLP Training** (https://learn.flucoma.org/learn/mlp-training) and **MLP Parameters** (https://learn.flucoma.org/learn/mlp-parameters).

:control hiddenLayers:

An array of numbers that specifies the internal structure of the neural network. Each number in the list represents one hidden layer of the neural network, the value of which is the number of neurons in that layer. Changing this will reset the neural network, clearing any learning that has happened.

:control activation:

An integer indicating which activation function each neuron in the hidden layer(s) will use. Changing this will reset the neural network, clearing any learning that has happened. The options are:

:enum:

:0:
**identity** (the output range can be any value)

:1:
**sigmoid** (the output will always be greater than 0 and less than 1)

:2:
**relu** (the output will always be greater than or equal to 0)

:3:
**tanh** (the output will always be greater than -1 and less than 1)

:control maxIter:

The number of epochs to train for when ``fit`` is called on the object. An epoch consists of training on all the data points one time.

:control learnRate:

A scalar indicating how much the neural network should adjust its internal parameters during training. This is the most important parameter to adjust while training a neural network.

:control momentum:

A scalar that applies a portion of previous adjustments to a current adjustment being made by the neural network during training.
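
As an illustration, a standard momentum update has the following shape (a sketch; the exact rule used internally is not documented here), where :math:`\eta` is ``learnRate``, :math:`m` is ``momentum`` and :math:`\nabla E` is the error gradient:

.. math::

   v_t = m \, v_{t-1} - \eta \, \nabla E(w_{t-1}), \qquad w_t = w_{t-1} + v_t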

:control batchSize:

The number of data points to use in between adjustments of the MLP's internal parameters during training.

:control validation:

A percentage (represented as a decimal) of the data points to randomly select, set aside, and not use for training (this "validation set" is reselected on each ``fit``). These points will be used after each epoch to check how the neural network is performing. If it is found to be no longer improving, training will stop, even if a ``fit`` has not reached its ``maxIter`` number of epochs.

:message fit:

:arg sourceDataSet: Source data

:arg targetLabelSet: Target labels

:arg action: Function to run when complete. This function will be passed the current error as its only argument.

Train the network to map between a source :fluid-obj:`DataSet` and target :fluid-obj:`LabelSet`

:message predict:

:arg sourceDataSet: Input data

:arg targetLabelSet: :fluid-obj:`LabelSet` to write the predicted labels into

:arg action: Function to run when complete

Predict labels for a :fluid-obj:`DataSet` (given a trained network)

:message predictPoint:

:arg sourceBuffer: Input point

:arg targetBuffer: Output point

:arg action: A function to run when complete. This function will be passed the predicted label.

Predict a label for a single data point in a |buffer|
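
A minimal training-and-prediction sketch in SuperCollider (it assumes ``~data`` is a :fluid-obj:`DataSet` and ``~labels`` a :fluid-obj:`LabelSet` with matching identifiers; the constructor arguments shown are illustrative)::

  (
  ~mlp = FluidMLPClassifier(s,
      hiddenLayers: [ 6 ],  // one hidden layer of six neurons
      activation: 1,        // sigmoid
      maxIter: 1000,
      learnRate: 0.1,
      batchSize: 4,
      validation: 0.1
  );
  // each call to fit runs up to maxIter epochs and posts the current error
  ~mlp.fit(~data, ~labels, action: { |error| "error: %".format(error).postln; });
  )

  // once the error is acceptably low, predict labels for a whole DataSet
  ~predicted = FluidLabelSet(s);
  ~mlp.predict(~data, ~predicted, action: { ~predicted.print; });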

:message clear:

48 changes: 33 additions & 15 deletions doc/MLPRegressor.rst
:sc-categories: Machine learning
:sc-related: Classes/FluidMLPClassifier, Classes/FluidDataSet
:see-also:
:description:

Perform regression between :fluid-obj:`DataSet`\s using a Multi-Layer Perceptron neural network.

:discussion:

For a thorough explanation of how this object works and more information on the parameters, visit the pages on **MLP Training** (https://learn.flucoma.org/learn/mlp-training) and **MLP Parameters** (https://learn.flucoma.org/learn/mlp-parameters).

:control hiddenLayers:

An array of numbers that specifies the internal structure of the neural network. Each number in the list represents one hidden layer of the neural network, the value of which is the number of neurons in that layer. Changing this will reset the neural network, clearing any learning that has happened.

:control activation:

An integer indicating which activation function each neuron in the hidden layer(s) will use. Changing this will reset the neural network, clearing any learning that has happened. The options are:

:enum:

:0:
**identity** (the output range can be any value)

:1:
**sigmoid** (the output will always be greater than 0 and less than 1)

:2:
**relu** (the output will always be greater than or equal to 0)

:3:
**tanh** (the output will always be greater than -1 and less than 1)

:control outputActivation:

An integer indicating which activation function each neuron in the output layer will use. Options are the same as ``activation``. Changing this will reset the neural network, clearing any learning that has happened.

:control tapIn:

The index of the layer to use as input to the neural network for ``predict`` and ``predictPoint`` (zero counting). The default of 0 is the first layer (the original input layer), 1 is the first hidden layer, etc. This can be used to access different parts of a trained neural network such as the encoder or decoder of an autoencoder (https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726).

:control tapOut:

The index of the layer to use as output of the neural network for ``predict`` and ``predictPoint`` (zero counting). The default of -1 is the last layer (the original output layer). This can be used to access different parts of a trained neural network such as the encoder or decoder of an autoencoder (https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726).
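
For example, with an autoencoder whose ``hiddenLayers`` are ``[ 8, 2, 8 ]``, the two-neuron bottleneck layer can be read directly (a sketch; setting the controls through these named accessors is assumed here)::

  ~ae = FluidMLPRegressor(s, hiddenLayers: [ 8, 2, 8 ]);
  // ...train it to reproduce its input, e.g. ~ae.fit(~data, ~data, ...)...

  ~ae.tapIn = 0;   // feed points into the original input layer
  ~ae.tapOut = 2;  // read the output of the middle (bottleneck) layer
  ~encoded = FluidDataSet(s);
  ~ae.predict(~data, ~encoded, action: { ~encoded.print; });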

:control maxIter:

The number of epochs to train for when ``fit`` is called on the object. An epoch consists of training on all the data points one time.

:control learnRate:

A scalar indicating how much the neural network should adjust its internal parameters during training. This is the most important parameter to adjust while training a neural network.

:control momentum:

A scalar that applies a portion of previous adjustments to a current adjustment being made by the neural network during training.

:control batchSize:

The number of data points to use in between adjustments of the MLP's internal parameters during training.

:control validation:

A percentage (represented as a decimal) of the data points to randomly select, set aside, and not use for training (this "validation set" is reselected on each ``fit``). These points will be used after each epoch to check how the neural network is performing. If it is found to be no longer improving, training will stop, even if a ``fit`` has not reached its ``maxIter`` number of epochs.

:message fit:

:arg sourceDataSet: Source data

:arg targetDataSet: Target data

:arg action: Function to run when complete. This function will be passed the current error as its only argument.

Train the network to map between a source and target :fluid-obj:`DataSet`
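
A common training pattern is to call ``fit`` repeatedly while watching the reported error (a sketch; it assumes ``~in`` and ``~out`` are :fluid-obj:`DataSet`\s with matching identifiers)::

  ~mlp = FluidMLPRegressor(s, hiddenLayers: [ 7 ], learnRate: 0.01);
  // run this line several times; the posted error should decrease
  ~mlp.fit(~in, ~out, action: { |error| "error: %".format(error).postln; });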

:message predict: