- Documentation of Mastodon
- Installation Instructions
- Numerical Features added to Mastodon
- Hierarchical Clustering of Lineage Trees
- Dimensionality reduction
- Import
- Export
- Technical Information
- Acknowledgements
Mastodon Deep Lineage is an extension of Mastodon. For the full documentation of Mastodon, please visit mastodon.readthedocs.io.
- Add the listed Mastodon update sites in Fiji:
| Feature name | Projections | Description | Formula/Visualisation |
|---|---|---|---|
| Branch N Leaves | idem | The total number of leaves of a branch spot in the whole track subtree of this branch spot. | |
| Branch N Successors and Predecessors | idem | Total number of successors and predecessors of a branch spot in the whole track subtree of this branch spot. Successors encode reproductivity, predecessors encode generation. | |
| Branch Sinuosity | idem | The sinuosity of a spot during its life span (cf. Sinuosity), i.e. how much the track represented by the branch is curved. Values close to 1: almost straight movement. Values significantly higher than 1: winding or meandering movement. Positive infinity (∞), if the spot is at the end at the same position as at the beginning, but has moved in between. | |
| Branch Average Movement | idem | The average movement per frame of a spot during its life span. | |
| Branch Movement Direction | idem | The movement direction of a branch spot represented as a normalized directional vector pointing from the start (spot) position to the end (spot) position of the BranchSpot. | |
| Branch Cell Division Frequency | idem | Number of cell divisions in the subtree rooted at each BranchSpot divided by the total duration of the branches in this subtree. | |
| Branch Relative Movement* | x, y and z component | The x, y and z components of the average speed of a spot during its life span relative to its n nearest neighbors. The number of neighbors to be considered can be specified by the user. Default is 5. | |
| | average speed | The average speed of a spot during its life span relative to its n nearest neighbors. | |
'*' The relative movement features cannot be called from the FeatureComputer directly. Instead, they can be accessed via the plugin menu: `Plugins > Compute Features > Movement of spots relative to nearest neighbors`
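The movement features in the table above can be illustrated with a minimal Python sketch. It assumes that sinuosity is the path length divided by the net displacement and that the average movement is the path length divided by the number of frame-to-frame steps. The function name `branch_features`, the input format and the handling of a branch that never moves (NaN) are assumptions for illustration only and not part of Mastodon's API.

```python
import math


def branch_features(positions):
    """Compute simple movement features for one branch.

    positions: list of (x, y, z) tuples, one per frame, ordered in time.
    """
    if len(positions) < 2:
        raise ValueError("need at least two positions")

    # Path length: sum of the distances between consecutive positions.
    path_length = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    # Net displacement: straight line from the first to the last position.
    displacement = math.dist(positions[0], positions[-1])
    n_steps = len(positions) - 1

    if displacement > 0:
        sinuosity = path_length / displacement
        # Normalized vector pointing from the start to the end position.
        direction = tuple(
            (end - start) / displacement
            for start, end in zip(positions[0], positions[-1])
        )
    elif path_length > 0:
        # Moved in between, but ended where it started.
        sinuosity = math.inf
        direction = None
    else:
        # Never moved at all (assumption: treated as undefined here).
        sinuosity = math.nan
        direction = None

    return {
        "sinuosity": sinuosity,
        "average_movement": path_length / n_steps,
        "direction": direction,
    }


straight = branch_features([(0, 0, 0), (1, 0, 0), (2, 0, 0)])
detour = branch_features([(0, 0, 0), (3, 4, 0), (6, 0, 0)])
loop = branch_features([(0, 0, 0), (1, 0, 0), (0, 0, 0)])
print(straight["sinuosity"], detour["sinuosity"], loop["sinuosity"])
```

The straight track yields a sinuosity of 1, the detour a value above 1, and the closed loop positive infinity, matching the three cases described for Branch Sinuosity.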
- Menu Location: `Plugins > Lineage Analysis > Hierarchical Clustering of Lineage Trees`
- This command groups similar lineage trees together.
- The lineage clustering operates on Mastodon's branch graph.
- Lineage trees are considered similar if they share a similar structure and thus represent a similar cell division pattern. The structure of a lineage tree is represented by the tree topology. This tree topology consists of the actual branching pattern and the cell lifetimes, i.e. the number of time points between two subsequent cell divisions.
- Functionality in a nutshell:
- The similarity of a pair of lineage trees is computed based on the Zhang edit distance for unordered trees (Zhang, K. Algorithmica 15, 205–222, 1996). This method captures the cost of the transformation of one tree into the other.
- The Zhang unordered edit distance allows the following edit operations. The edit operations are defined in a way that they satisfy the constraints elaborated in section 3.1 ("Constrained Edit Distance Mappings") of the paper: Zhang, K. Algorithmica 15, 205–222, 1996
Note: The prefix T may represent a node or a complete subtree. Nodes without this prefix are just nodes.

1. Change label

              A             A'
             / \    -->    / \
           TB   TC       TB   TC

2a: Delete subtree (opposite of 2b)

              A             A
             / \    -->     |
           TB   TC          TB

2b: Insert subtree (opposite of 2a)

              A             A
              |     -->    / \
              TB         TB   TC

3a: Delete one child of a node and delete the node itself (opposite of 3b)

              A             A
             / \    -->    / \
            B   TC       TD   TC
           / \
         TD   TE      (delete TE and B, TD becomes child of A)

3b: Insert a node and insert one child at that node (opposite of 3a)

              A             A
             / \    -->    / \
           TB   TC        D   TC
                         / \
                       TB   TE     (insert D and TE, TB becomes child of D)

4a: Delete node and delete its sibling subtree (opposite of 4b)

              A             A
             / \           / \
            B   TC  -->  TD   TE
           / \
         TD   TE      (Node B and its sibling subtree TC are deleted and the children
                       of B, namely TD and TE, become the children of A)

4b: Insert node and insert a sibling subtree (opposite of 4a)

              A             A
             / \           / \
           TB   TC  -->   D   TE
                         / \
                       TB   TC     (Node D and its sibling TE are inserted,
                                    TB and TC become the children of D)

As an example, the following case explicitly does not fulfill the constraints mentioned in the paper:

Delete a node without deleting one of its children

              A               A
             / \    -->     / | \
            B   TC        TD  TE  TC
           / \
         TD   TE      (delete B, TD and TE become children of A and TC remains)
- A basic example of the tree edit distance:

      Tree1
                         node1(node_weight=13)
                    ┌──────────┴─────────────┐
                    │                        │
          node2(node_weight=203)   node3(node_weight=203)

      Tree2
                         node1(node_weight=12)
                    ┌──────────┴─────────────┐
                    │                        │
          node2(node_weight=227)   node3(node_weight=227)
         ┌──────────┴─────────────┐
      node4(node_weight=10)  node5(node_weight=10)

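- Edit distance of 69, because: the label of node1 changes from 13 to 12 (cost 1), the labels of node2 and node3 each change from 203 to 227 (cost 24 each), and node4 and node5 are inserted (cost 10 each), i.e. 1 + 24 + 24 + 10 + 10 = 69.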
- The similarity measure uses the attribute cell lifetime, which is computed as the difference of time points between two subsequent divisions. There are multiple ways to compute the similarity between two lineage trees based on this attribute:
  - The sum of the edit distances as shown in the basic example above. Individual differences in the cell lifetimes may be normalized by their sum (i.e. local normalization)
  - The sum of the edit distances as shown in the basic example above, normalized by the maximum possible edit distance of the two trees (normalized Zhang edit distance)
  - The sum of the edit distances normalized by the number of the involved nodes (per branch Zhang edit distance)
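The basic example and the three variants listed above can be reproduced with a short Python sketch. It does not implement the Zhang algorithm: it does not search for the optimal mapping but simply pairs nodes by name, which is the mapping that yields the distance of 69 stated for the example. Three further points are assumptions made for illustration and may differ from the plugin's implementation: the maximum possible edit distance is taken as the sum of all node weights of both trees, the number of involved nodes as the total node count of both trees, and the locally normalized cost of an inserted node as 1.

```python
# Node weights (cell lifetimes) of the two example trees.
tree1 = {"node1": 13, "node2": 203, "node3": 203}
tree2 = {"node1": 12, "node2": 227, "node3": 227, "node4": 10, "node5": 10}


def paired_weights(t1, t2):
    """Pair nodes by name; a node missing in one tree gets weight 0 there,
    so that its full weight counts as insertion or deletion cost."""
    names = sorted(set(t1) | set(t2))
    return [(t1.get(name, 0), t2.get(name, 0)) for name in names]


pairs = paired_weights(tree1, tree2)

# Plain sum of the edit costs: 1 + 24 + 24 + 10 + 10.
zhang = sum(abs(a - b) for a, b in pairs)

# Local normalization: each difference is divided by the sum of the two weights.
local = sum(abs(a - b) / (a + b) for a, b in pairs)

# Assumption: the maximum possible distance deletes tree1 and inserts tree2.
normalized = zhang / (sum(tree1.values()) + sum(tree2.values()))

# Assumption: the involved nodes are all nodes of both trees.
per_branch = zhang / (len(tree1) + len(tree2))

print(zhang, round(local, 4), round(normalized, 4), per_branch)
```

The unnormalized distance grows with tree size and cell lifetimes, which is why the normalized variants are easier to compare across lineage trees of different sizes.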
- The similarities are computed between all possible pairs of lineage trees, leading to a two-dimensional similarity matrix. Low tree edit distances in this matrix represent a high similarity between the respective pair of lineage trees. This matrix is then used to perform an agglomerative hierarchical clustering into a specifiable number of groups.
- For the clustering, three different linkage methods can be chosen.
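The clustering step can be sketched in plain Python as follows. This is a naive illustration of agglomerative clustering on a distance matrix with average, single or complete linkage, not the plugin's implementation. The function name `hierarchical_clustering` and the example matrix are hypothetical.

```python
def hierarchical_clustering(distances, n_clusters, linkage="average"):
    """Agglomerative clustering on a symmetric distance matrix (list of lists).

    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    if not 1 <= n_clusters <= len(distances):
        raise ValueError("n_clusters must be between 1 and the number of items")

    # How the pairwise distances between two clusters are combined.
    combine = {
        "average": lambda values: sum(values) / len(values),
        "single": min,
        "complete": max,
    }[linkage]

    def cluster_distance(a, b):
        return combine([distances[i][j] for i in a for j in b])

    # Start with every item in its own cluster.
    clusters = [[i] for i in range(len(distances))]

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        _, p, q = min(
            (cluster_distance(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)

    return sorted(clusters)


# Hypothetical tree edit distances between four lineage trees.
distances = [
    [0, 1, 9, 10],
    [1, 0, 8, 9],
    [9, 8, 0, 2],
    [10, 9, 2, 0],
]
print(hierarchical_clustering(distances, 2))  # [[0, 1], [2, 3]]
```

Trees 0 and 1 as well as trees 2 and 3 have low mutual distances and therefore end up in the same group.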
- Crop criterion:
  - The criterion for cropping the lineage trees
    - Number of spots (default)
    - Time point
- Crop start
  - At which number of spots / time point (depending on the chosen crop criterion) the analysis should start
- Crop end
  - At which number of spots / time point (depending on the chosen crop criterion) the analysis should end
- Number of clusters
  - How many groups the lineage trees should be assigned to by the clustering
  - Must not be greater than the number of valid lineage trees
- Minimum number of divisions
  - The minimum number of divisions a lineage tree should have so that it is included in the analysis
- Similarity measures:
  - (default) 1,2
  - 1
  - Zhang Tree Edit Distance1,2
- Linkage strategy for hierarchical clustering, cf. linkage methods
  - Average (default)
  - Single
  - Complete
- List of further projects
  - If you have multiple similar projects, you can add them here to get an average clustering taking all projects into account.
  - Mastodon projects can be added / removed using
    - "Add files..."
    - "Add folder content..."
    - "Remove selected"
    - "Clear list"
    - Drag and drop of files and folders
  - The name of the currently open project is shown above the list. The current project is always included in the hierarchical clustering. It cannot be added to the list.
  - It is important that the names of the roots of lineages in all projects included in the hierarchical clustering are the same. Otherwise, the hierarchical clustering will not work.
  - The effect of adding further projects is that the similarity matrix is computed for each project separately and then averaged, resulting in a more robust hierarchical clustering.
- Add generated tags to further projects
  - If checked, the tags generated by the hierarchical clustering are also added to the further projects.
  - Important note: this will write tags to these projects. Consider making a backup of the further projects before running the hierarchical clustering if you choose this option.
- Show dendrogram of hierarchical clustering of lineage trees
  - If checked, the dendrogram is shown after the hierarchical clustering
- Check validity of parameters
  - Press this button to check whether a hierarchical clustering is possible with the current parameters
  - If the parameters are invalid, a message will appear with the reason(s)
  - Possible reasons for invalid parameters:
    - The number of clusters is greater than the number of valid lineage trees
    - The crop start is greater than the crop end
    - The crop end is greater than the maximum number of spots / time points
    - Further projects that are included in the hierarchical clustering could not be found / opened
- Demo data: Example data set
- The track scheme of the demo data contains 8 lineage trees in total. You may see that the "symmetric", the "asymmetric" and the "single division" trees look similar to each other, but dissimilar to the remaining trees.
- The hierarchical clustering dialog.
- Cf. section Parameters for the meaning of the parameters.
- Not visible to the user, a similarity matrix is computed based on the chosen similarity measure. For the demo data, the matrix looks like this. Highly similar trees have low distances in this matrix.
- The resulting dendrogram.
- Users can toggle root labels, tags, the clustering threshold and the median of the tree edit distances on and off.
- If the option `Show tag labels` is checked, the tag set shown in the dendrogram can be chosen.
- Export options for the dendrogram to SVG and PNG are accessible via the context menu.
- The result of the hierarchical clustering can be exported to a CSV file via the context menu. The exported file contains the root names of the lineage trees, the tag set value, the assigned group and the similarity score. The similarity score indicates how similar the lineage trees in this group are. The lower the score, the more similar the trees are.
- The resulting tag set may be used for coloring the track scheme.
- The resulting tag set may be used for coloring the track scheme branch view.
- The resulting tag set may be used for coloring the spots in the BigDataViewer.
To visualize high-dimensional data, e.g. in two dimensions, and potentially gain more insight into your data, you can reduce the dimensionality of the measurements using these algorithms:
- UMAP
- t-SNE
- Menu Location: `Plugins > Compute Feature > Dimensionality reduction`
Select the graph type whose features should be dimensionality reduced: either the Model Graph with features for Spots and Links or the Branch Graph with features for BranchSpots and BranchLinks. Next, select the features and feature projections that should be dimensionality reduced. Prefer features that describe the phenotype (e.g. size, shape, velocity, number of neighbors, etc.). Only select positional features (e.g. centroid, coordinates, timeframe, etc.) if the position of the cells within the image is descriptive for the phenotype. If you are unsure, you can select all features and then remove the positional features later.
The available algorithms reduce the dimensionality of the selected features and add the results as new features to the table. To do so, the selected algorithm uses the data matrix from the spot or branch spot table, where each row represents a spot or branch spot and each column represents a feature. Link and branch link features can also be included. If they are selected, the algorithm uses, for each spot, the link feature value of its incoming edge or, if there is more than one incoming edge, the average of the values of all incoming edges.
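How link features could enter the data matrix can be sketched as follows. The function name `build_data_matrix` and the dictionary-based input format are hypothetical and do not reflect Mastodon's API. The treatment of spots without any incoming edge (NaN) is an assumption, since the text above does not specify it.

```python
import math


def build_data_matrix(spot_features, link_features, incoming_links):
    """Build one row per spot: its own feature values plus one link feature.

    spot_features:  dict spot id -> list of spot feature values
    link_features:  dict link id -> link feature value
    incoming_links: dict spot id -> list of ids of links pointing to that spot
    """
    rows = {}
    for spot_id, values in spot_features.items():
        link_values = [link_features[link] for link in incoming_links.get(spot_id, [])]
        if link_values:
            # One incoming edge: its value. Several: the average of all values.
            link_value = sum(link_values) / len(link_values)
        else:
            # Assumption: spots without incoming edge get NaN.
            link_value = math.nan
        rows[spot_id] = list(values) + [link_value]
    return rows


spot_features = {"a": [1.0, 5.0], "b": [2.0, 6.0], "c": [3.0, 7.0]}
link_features = {"a->b": 2.0, "a->c": 2.0, "b->c": 4.0}
incoming_links = {"b": ["a->b"], "c": ["a->c", "b->c"]}

matrix = build_data_matrix(spot_features, link_features, incoming_links)
print(matrix["b"], matrix["c"])
```

Spot `b` receives the value of its single incoming link, whereas spot `c` receives the average of its two incoming links.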
The dialog will look like this:
By default, all measurements are selected in the box.
- Standardize: Whether to standardize the data before reducing the dimensionality. Standardization is recommended when the data has different scales / units. Further reading: Standardization.
- Number of dimensions: The number of reduced dimensions to use. The default is 2, but 3 is also common. Further reading: Number of Dimensions.
- Number of neighbors: The size of the local neighborhood (in terms of number of neighboring sample points) used for manifold approximation. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved. In general, it should be in the range 2 to 100. Further reading: Number of Neighbors.
- Minimum distance: The minimum distance that points are allowed to be apart from each other in the low dimensional representation. This parameter controls how tightly UMAP is allowed to pack points together. Further reading: Minimum Distance.
- Perplexity: The perplexity is related to the number of nearest neighbors that are used in other manifold learning algorithms. Larger datasets usually require a larger perplexity. The recommended range is between 5 and 50. Further reading: Perplexity.
- Maximum number of iterations: The maximum number of iterations for the optimization. The default is 1000. More iterations will give more accurate results, but will also take longer to compute. Further reading: Maximum Number of Iterations.
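The effect of the Standardize option can be illustrated with a small sketch. It assumes the common z-score definition (subtract the column mean, divide by the column standard deviation) and uses the population standard deviation. Whether the plugin uses the population or the sample standard deviation is not stated here, so this choice is an assumption, as is the function name `standardize`.

```python
import statistics


def standardize(matrix):
    """Scale each column of a row-major matrix to zero mean and unit variance."""
    columns = list(zip(*matrix))
    means = [statistics.fmean(column) for column in columns]
    stdevs = [statistics.pstdev(column) for column in columns]
    return [
        # Constant columns (standard deviation 0) are mapped to 0.0.
        [(value - mean) / stdev if stdev > 0 else 0.0
         for value, mean, stdev in zip(row, means, stdevs)]
        for row in matrix
    ]


# Two features on very different scales, e.g. a radius and a volume.
data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize(data)
print(scaled)
```

After standardization both columns contain the same values, so neither feature dominates the distance computations of UMAP or t-SNE merely because of its unit.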
When you are done with the selection, click on `Compute`.
The resulting values will be added as additional columns to the selected table.
You can visualize the results using the `Grapher` view of Mastodon and selecting the newly added columns.
Visualization with the Mastodon Blender View is also possible.
The example above has been generated using the tgmm-mini dataset, which is included in the Mastodon repository.
- Menu Location: `File > Import > Import spots from label image`
- You can use the plugin to import spots from a label image representing an instance segmentation into Mastodon. This may be useful if you have an instance segmentation of cells or other objects, and you want to track them using Mastodon.
- The label image is expected to contain the spot ids as pixel values.
- The label image is expected to have the same dimensions as the image data in Mastodon.
- Labels are processed frame by frame.
- Multiple blobs with the same id in the same frame are considered to belong to the same spot by this importer. It is advised to use unique ids for spots in the same frame.
- The resulting spots are ellipsoids with the semi-axes computed from the variance-covariance matrix of the pixel positions of each label.
- Labels with only one pixel are ignored. This is because the variance-covariance matrix is not defined for a single point. If you want to import single-pixel spots, you can use the `Import Spots from CSV` plugin.
- The resulting spots may be linked using the linker plugin in Mastodon (`Plugins > Tracking > Linking...`) or Elephant.
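Why single-pixel labels cannot be converted becomes visible when the variance-covariance matrix is written out. The sketch below computes the centroid and the sample covariance matrix of the pixel positions of one label. The function name `centroid_and_covariance` and the use of the sample covariance (division by n - 1) are assumptions. The eigen decomposition that would yield the semi-axes of the ellipsoid is omitted.

```python
def centroid_and_covariance(pixels):
    """Centroid and sample covariance matrix of the pixel positions of one label.

    pixels: list of (x, y, z) coordinates that share the same label id.
    Returns None for fewer than two pixels, where the covariance is undefined.
    """
    n = len(pixels)
    if n < 2:
        return None
    dims = len(pixels[0])
    mean = [sum(p[d] for p in pixels) / n for d in range(dims)]
    covariance = [
        [
            sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / (n - 1)
            for j in range(dims)
        ]
        for i in range(dims)
    ]
    return mean, covariance


# A flat 2x2 square of pixels in the plane z = 0.
mean, covariance = centroid_and_covariance(
    [(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)]
)
print(mean)
print(covariance)
print(centroid_and_covariance([(4, 4, 4)]))  # None
```

For a single pixel the divisor n - 1 would be zero, which is the reason such labels are skipped by the importer.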
- Ellipsoid scaling factor: The scaling factor applied to the semi-axes of the ellipsoids. The default is 1.0. Use a factor > 1 to increase or a factor < 1 to decrease the size of the resulting ellipsoids. A factor of 1 corresponds to ellipsoids drawn at 2.2σ.
- Link spots having the same id in consecutive frames: If checked, spots with the same label id in consecutive frames are linked. Division or merge events are not considered.
- Image source that has been used for the external segmentation: The channel containing the image data that has been used to create the label image. This channel is used to check, if the dimensions of the label image match the dimensions of the image data in Mastodon.
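The option to link spots having the same id in consecutive frames can be sketched as a set intersection per pair of neighboring frames. The function name `link_same_ids` and the input format (one set of label ids per frame) are hypothetical and chosen for illustration only.

```python
def link_same_ids(labels_per_frame):
    """Link labels that occur with the same id in two consecutive frames.

    labels_per_frame: list of sets of label ids, indexed by frame.
    Returns a list of (frame, next_frame, label_id) tuples.
    """
    links = []
    for frame in range(len(labels_per_frame) - 1):
        shared = labels_per_frame[frame] & labels_per_frame[frame + 1]
        links.extend((frame, frame + 1, label) for label in sorted(shared))
    return links


# Label 2 disappears after frame 0 and label 3 appears in frame 1.
links = link_same_ids([{1, 2}, {1, 3}, {1, 3}])
print(links)  # [(0, 1, 1), (1, 2, 1), (1, 2, 3)]
```

Label 3 is not connected to label 2 even if it were its daughter cell, which reflects that division and merge events are not considered by this option.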
- The label image can be opened in ImageJ and the plugin can be called from the menu: `File > Import > Import spots from label image > Import spots from ImageJ image`
- Please make sure that the label image is the active image in ImageJ.
- Please make sure that the label image has the same dimensions as the image data in Mastodon.
  - You can use the `Image > Properties` command in ImageJ to check (and set) the dimensions of the label image.
- You can also watch a video tutorial on how to import spots from a label image in Mastodon
- Example dataset: Fluo-C3DL-MDA231 from Cell Tracking Challenge
- Extract the file to a folder named `Fluo-C3DL-MDA231`
- Import the image sequence with the actual image, contained in folder `Fluo-C3DL-MDA231/01/`, into ImageJ using `File > Import > Image Sequence...`
- Set the dimensions of the image sequence to 512x512x1x30x12 (XYCZT) using `Image > Properties`
- Open Mastodon from Fiji and create a new project with the image sequence
- Import the image sequence encoding the label images, contained in folder `Fluo-C3DL-MDA231/01_ERR_SEG/`, into ImageJ
- Open the import window in Mastodon: `File > Import > Import spots from label image > Import spots from ImageJ image`
- You can keep the ellipsoid scaling factor at 1.0. Select a factor higher than 1.0 to increase and a factor lower than 1.0 to decrease the size of the resulting ellipsoids.
- Check the box to link spots having the same label id in consecutive frames.
- Select the channel in BigDataViewer containing the image that has been used to create the label image. The channel is used to check whether the dimensions of the label image in ImageJ match the dimensions of the image data in Mastodon.
- Click `OK` and the spots are imported into Mastodon.
- The plugin can be called from the menu: `File > Import > Import spots from label image > Import spots from BDV channel`
- You can also watch a video tutorial on how to import spots from a label image in Mastodon
- Example dataset: Fluo-C3DL-MDA231 from Cell Tracking Challenge
- Extract the file to a folder named `Fluo-C3DL-MDA231`
- Import the image sequence with the actual image, contained in folder `Fluo-C3DL-MDA231/01/`, into ImageJ
- Import the image sequence encoding the label images, contained in folder `Fluo-C3DL-MDA231/01_ERR_SEG/`, into ImageJ
- Open Mastodon from Fiji and create a new project with the merged image
- Open the import window: `File > Import > Import spots from label image > Import spots from BDV channel`
- Keep the ellipsoid scaling factor at 1.0
- You can decide to link spots having the same label id in consecutive frames. This is useful if you have a time series of label images and you want to link spots between frames. Dividing spots cannot be linked this way. The Mastodon Linker plugin should be used for this.
- Select the BDV channel containing the label image that has been used to create the segmented label image. This is used to check whether the dimensions of the label image and the image data in BDV match, which is required.
- Click `OK` and the spots are imported into Mastodon.
- Menu Location: `File > Export > Export label image using ellipsoids`
- The Label image exporter is capable of saving a label image to a file using the existing ellipsoids in Mastodon.
- For the labels, the spot ids, branch spot ids or track ids that correspond to the spots / ellipsoids may be used. Since these ids are zero-based in Mastodon, an offset of 1 is added to all ids so that no label clashes with the background value of zero.
- The recommended export format is '*.tif'. However, other formats supported by ImageJ should also work.
- The export uses an image with signed integer value space, thus the maximum allowed id is 2,147,483,646.
- The dialog:
- Label Id: The id that is used for the labels. The default is the Spot track Id.
- Frame rate reduction: Only export every n-th frame. 1 means no reduction. Value must be >= 1.
- The frame number corresponds to the Spot frame column in the feature table.
- Resolution level: Spatial resolution level of the export. 0 means highest resolution. Values > 0 mean lower resolution.
- Save to: Path to the file to save the label image to. Should end with '.tif'.
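The id offset and the frame rate reduction described above can be sketched as follows. The function names are hypothetical. That the reduced export starts at frame 0 is an assumption, since the text only states that every n-th frame is exported.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer: 2,147,483,647


def label_for_id(zero_based_id):
    """Map a zero-based id to a label value; 0 is reserved for the background."""
    if not 0 <= zero_based_id <= INT32_MAX - 1:
        raise ValueError(f"id must be between 0 and {INT32_MAX - 1}")
    return zero_based_id + 1


def exported_frames(n_frames, reduction):
    """Frames that are exported when only every n-th frame is kept."""
    if reduction < 1:
        raise ValueError("frame rate reduction must be >= 1")
    # Assumption: counting starts at frame 0.
    return list(range(0, n_frames, reduction))


print(label_for_id(0))          # 1
print(exported_frames(10, 3))   # [0, 3, 6, 9]
```

The largest id that still fits after adding the offset is 2,147,483,646, which is the limit stated above.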
- Demo data: Example data set
- The timelapse with the ellipsoids in BigDataViewer:
- The exported tif imported into Napari 3D view:
- Menu Location: `File > Export > Export to GraphML (branches)`
- Exports the branch graph to a GraphML file.
- The graph is directed. The branch spots are the vertices and the branch links are the edges.
- The vertices receive a label attribute with the branch spot name. The vertices receive a duration attribute with the branch duration.
- The edges are not labeled and have no attributes.
- GraphML can be visualized with Cytoscape, yEd or Gephi.
- GraphML can be processed in Java using the JGraphT library.
- GraphML can be processed in Python using the NetworkX library.
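As an illustration of processing such a file in Python, the following sketch reads vertices and edges using only the standard library. The embedded GraphML text is a hand-written, hypothetical example. The key ids (`d0`, `d1`), node ids and branch names are assumptions and may differ in a file exported by the plugin. With NetworkX installed, `networkx.read_graphml` could be used instead.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical example file content with a label and a duration per vertex.
EXAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="label" attr.type="string"/>
  <key id="d1" for="node" attr.name="duration" attr.type="int"/>
  <graph edgedefault="directed">
    <node id="0"><data key="d0">mother</data><data key="d1">12</data></node>
    <node id="1"><data key="d0">daughter1</data><data key="d1">7</data></node>
    <node id="2"><data key="d0">daughter2</data><data key="d1">9</data></node>
    <edge source="0" target="1"/>
    <edge source="0" target="2"/>
  </graph>
</graphml>"""


def read_branch_graph(graphml_text):
    """Return (vertices, edges) of a GraphML document given as a string."""
    root = ET.fromstring(graphml_text)
    # Map key ids (e.g. "d0") to attribute names (e.g. "label").
    keys = {key.get("id"): key.get("attr.name") for key in root.findall("g:key", NS)}
    graph = root.find("g:graph", NS)
    vertices = {
        node.get("id"): {
            keys.get(data.get("key"), data.get("key")): data.text
            for data in node.findall("g:data", NS)
        }
        for node in graph.findall("g:node", NS)
    }
    edges = [(edge.get("source"), edge.get("target")) for edge in graph.findall("g:edge", NS)]
    return vertices, edges


vertices, edges = read_branch_graph(EXAMPLE)
total_duration = sum(int(v["duration"]) for v in vertices.values())
print(vertices["0"]["label"], edges, total_duration)
```

Attribute values are returned as strings, so numeric attributes such as the duration need to be converted before further use.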
- Export all branches to GraphML (one file)
- Exports the whole branch graph to a single file.
- Select a file to save to. Should end with '.graphml'.
- Export selected branches to GraphML (one file)
- Exports the selected branches to a single file.
- The selected branches are the ones that are highlighted in the branch view.
- A branch is considered selected if at least one of its spots is selected. However, the exported duration attribute always reflects the whole branch duration.
- Select a file to save to. Should end with '.graphml'.
- Export tracks to GraphML (one file per track)
- Exports each track to a separate file.
- Select a directory to save to.
- Demo data: Example data set
- The resulting file loaded into yEd:
- The resulting file loaded into Cytoscape:
- You are welcome to submit Pull Requests to this repository. This repository runs code analyses on every Pull Request using SonarCloud.
- Please read the general advice re contributing to Mastodon and its plugins.
- If you would like to contribute to this documentation, feel free to open a pull request. The documentation is written in Markdown format.
- The development of this plugin was supported by the DFG under grant 490966236 and the ANR under grant ANR-21-CE13-0044.