-
Notifications
You must be signed in to change notification settings - Fork 13
/
Copy pathNEWS
479 lines (401 loc) · 38.3 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
Changes in version 2.23.1 (Release date: April 23, 2024)
==============
* Removed option "MB" for argument `mergeMethod` for the function `mergeClusters`. This was a method of Meinshausen and Buhlmann
(2005) to estimate the number of non-null genes, but their supporting package `howmany` that implemented the method is no longer supported on CRAN.
Changes in version 2.17.3 (Release date: May 12, 2022)
==============
* Fixed error in `plotBarplot` in handling non-assigned (-1) when comparing two clusterings (colors to clusters would be incorrect)
* Fixed error in subsetting of `ClusterExperiment` object if the unassigned cluster category (-1) was given a label in `clusterLegend` different than "-1" (would make those samples regular cluster).
Changes in version 2.7.3 (Release date: April 27, 2020)
==============
* Changes to clustering internals (`mainClustering`, `seqCluster`, `subsampleClustering`) and `clusterSingle`.
- they no longer take diss/x, but just 'inputMatrix' and 'inputType', so allow additional input types to more easily be added in future.
- Introduce `inputType="cat"` for categorical matrix to all the intermediate functions
- No longer allow `distFunction` as argument to internal clustering functions. These functions no longer will calculate distances, only the top level function, `clusterSingle`. This will keep from internally calculating distances without the user being aware
- error handling is different, as are warnings/messages. Single internal function `.checkArgs` now checks match between arguments and combinations of arguments, used at all functions (including `clusterMany`).
- There is no calculating of dissimilarity matrices unless `makeMissingDiss=TRUE` (new argument). This argument is also in `clusterMany` and `RSEC`, so that no distances are calculated without explicit user request.
* Changes to clustering functions:
- In order to cluster the results of subsampling, a clustering function must now accept data of form `cat` (not `dist` as before), i.e. the BxN matrix of clusterings from subsampling. Built in clustering techniques that took type `diss` (`pam`, `hier01` and `hierK`) have been adapted to do this.
- Built-in functions that accept `cat` also have functionality to reduce the BxN matrix to a B x m matrix, where m is the number of unique values of clusterings. This does affect the clustering results, as taking the average, etc. between samples will be affected, but for extremely large datasets can make the clustering more feasible. This option can be set in the `clusterArgs` list of options sent to the function: `clusterArgs=list(removeDup=TRUE)`.
- The data types that the clustering function takes should now be defined as a *vector* of acceptable types (previously was a single value, and was one of "diss", "X", and "both". Now the "both" option would be given as c("X","diss")).
* Changes to `makeConsensus`:
- `clusterArgs`: list of arguments that will be passed to `clusterArgs` argument of `mainClustering` to allow control of the clustering
- `whenUnassign`: when to forcibly set unassigned to -1 for those. option `whenUnassign="after"` corresponds to previous version, but now default is `before`, i.e. to cluster without those samples, so those samples don't affect the clustering of those that will get assignments.
- The matrix version of `makeConsensus` no longer returns both the final clustering and the clustering from unassigning samples with large proportion of -1 values. It now only returns the final clustering.
* Changed `coClustering` slot to not only take full NxN matrix, but also
- index of clusterings used (from `clusterMatrix` of object)
- NxB matrix of clusterings (also in sparse format)
- `plotCoClustering` now calculates the hamming distance of clusterings; if requested, will return object with distance to avoid having to calculate in future plots (meaning the NxN matrix must be saved, in sparse format).
* Added checks in builtin functions for very small numbers of observations (mainly for unit tests)
* `addClusterings` applied to two CE objects now gives more arguments about how will copy over the clustering information of one to the other.
* For sequential clustering, added option to make `top.can` be fraction between 0 and 1 indicating keep all candidate clusterings that have at least this proportion (of the remaining data) in the clustering. Default is now 0.01 (only clusters with at least 1% of clustered data can be considered as stable over different choices).
Bugs:
* Export Function`getReducedData`
* Fix error in `makeConsensus`: was calculating the unassigned proportion in the wrong margin and proportion=1 correctly (matrix inversion problem).
* Fix bug in `clusterMany` that was not grabbing correct distFunction results.
* Set default of `clusterMatrixColors` for argument `whichClusters` to be "all"
Changes in version 2.7.2 (Release date: April 9, 2019)
==============
Bugs:
* Fix error from R 4.0 changing definition of treatment of NULL ("For consistency, N <- NULL; N[[1]] <- val now turns N into a list also when val) has length one. This enables dimnames(r1)[[1]] <- "R1" for a 1-row matrix r1, fixing PR#17719 reported by Serguei Sokol."). Replaced all `<- c()` with `<- vector(...)` and the appropriate class.
Changes in version 2.7.1 (Release date: Nov 5, 2019)
==============
Bugs:
* Fix error in making sample dendrogram
Changes in version 2.5.7 (Release date: October 22, 2019)
==============
Bugs:
* Fix logical precedence error in C++ code for subsample loop
Changes in version 2.5.3 (Release date: May 29, 2019)
==============
Changes:
* Made "kmeans" the default in subsampling if data type is "X"
* Set checkDiss=FALSE by default for most all functions (exception is when user defines distance function)
* Add warnings when forced to calculate a nxn distance matrix
* Improve generic classification to centroids (used by "pam") so not calculate unnecessary distances.
* Added mbkmeans as built-in cluster function
Bugs:
* Remove bug in built-in cluster function "kmeans" so as to not make unnecessary nxn distance matrix.
Changes in version 2.3.0( Release date: May 3, 2019)
==============
Changes:
* MAJOR CHANGE TO CLASS (1): the dendrogram is now stored in the object as the S4 class `phylo4d` in the package `phylobase`. This allows easier manipulation of the dendrogram by the user (including user-defined dendrograms), as well as allowing storage of information about dendrogram nodes that the programs can make use of. The slot "dendro_outbranch" has been removed because this information is now stored in the `phylo4d` object.
* MAJOR CHANGE TO CLASS (2): the `coClustering` is now stored in the object as a sparse matrix of the S4 class `Matrix` from the package `Matrix`
* New functions:
- `sampleDendrogram`, `clusterDendrogram`, `nodeLabels` to access and manipulate the stored dendrograms
- `numericalAsCharacter` to allow create two digit characters from integers for better sorting of character versions of integers.
- `getClusterIndex` function used to match argument `whichClusters` to indices of the `clusterMatrix` (previously was not exported or documented)
* Added arguments to the following functions:
- `whichClusters` to `getBestFeatures`
- `calculateSample` to `makeDendrogram`
- `plotType` to `plotClustersTable` (now allows "bubble" version of plot of overlap ), and many other arguments related to controlling the plotting.
- `labelTracks` to `plotHeatmap` to control whether to add labels to colored tracks.
* Changed `makeDendrogram` so that the samples dendrogram is made faster. Previously was made by the (time-consuming) process of running `hclust` with "fake data" where each sample was replaced with the mediod of cluster. Now the sample dendrogram is created by manually pasting a (arbitrary) binary tree of the samples with zero-length edges onto the cluster's hierarchical tree.
* Cluster names like `m1`, `m2` given by `mergeClusters` or `c1`, `c2` by `makeConsensus` will now be `m01`, `m02`, etc. This will help in the sorting correctly by cluster names. If needed (i.e. there are more than 100 clusters) the numbers will be augmented to `m001`, more than 1000 to `m0001` (see new function `numericalAsCharacter`)
* The argument `replaceCoClustering` in `clusterSingle` has been changed to `saveSubsamplingMatrix` and the default behavior of when the subsampling matrix is saved and returned has been modified. See `clusterSingle`.
* `plotClustersTable` now allows that the plotted value be Jaccard Similarity.
* `getBestFeatures` now returns `ContrastName` and `InternalName`, where `ContrastName` is now parsed to match the user-defined cluster/node names, while `InternalName` uses the internal, fixed names for clusters/nodes. (Previous versions returned a data.frame with the column `ContrastName` using only the internal fixed names and there was no column `InternalName`).
Bugs:
* Fixed bug in `tableClusters` that prevented option `useNames` from being passed if `whichClusters` argument was not given.
Changes in version 2.1.5 ( Release date: 2018-06-28)
==============
Changes:
* Add functionality to `getBestFeatures` to allow `edgeR` for DE, as well as weights used with `edgeR` for `zinbwave` compatability. As part of this change:
- Removed `isCount` argument and replaced with more fine-grained `DEMethod` argument in `getBestFeatures`, `mergeClusters`; or `mergeDEMethod` in `RSEC`.
- *Change to class definition*: added slot `merge_demethod` to keep track of the DE method used in merging
* Change function names (old function name is now depricated):
- `combineMany` -> `makeConsensus`
- `removeUnclustered` -> `removeUnassigned`
* Arguments to functions changed:
- `combineProportion` -> `consensusProportion` in `RSEC`
- `combineMinSize` -> `consensusMinSize` in `RSEC`
- `sampleData` -> `colData` to match `SummarizedExperiment` syntax (in many plotting functions).
- `alignSampleData` -> `alignColData` in `plotHeatmap`
- `ignoreUnassignedVar` -> `filterIgnoresUnassigned` in `mergeClusters` (and other functions) for clarity.
- `removeNegative` -> `removeUnassigned` in `getBestFeatures` for uniformity
- Removed `largeDataset` option to `subsampleClustering` because no longer provides advantage.
- `nBlank` -> `nBlankFeatures` in `makeBlankData` to allow for samples
* Created functions:
- `primaryClusterLabel` and `primaryClusterType`
- `getReducedData`
- `assignUnassigned`: assigns unassigned samples to nearest cluster
- `renameClusters` and `recolorClusters`: assign new names/colors to clusters within a particular clustering
- `clusterMatrixColors`: wrapper to `convertClusterLegend` to return matrix like `clusterMatrix` only with colors in place of the internal cluster ids (like existing `clusterMatrixNamed`)
- `plotClustersTable` for plotting a heatmap showing the results of `tableClusters`
- `subsetByCluster` for subsetting CE object to only those samples in a particular cluster(s) of a clustering.
- `plotFeatureScatter` for a scatter plot of 2+ features (genes) colored by cluster
- `addToColData` and `colDataClusters` adding clustering information to colData of object. `addToColData` returns object with `colData` augmented, while `colDataClusters` just returns the `DataFrame` with clusterings added.
- `updateObject` to update historical object created from previous versions to the current class definitions.
* Added arguments:
- `whichAssay` to all functions to allow the user to select the assay on which the operations will be performed.
- `stopOnErrors` to `RSEC`
- `nColLegend` to `plotReducedDims`
- `subsample` and `sequential` to `RSEC` to allow for opting out of those options (but default is `TRUE` unlike `clusterMany`)
- `nBlankSamples` and `groupsOfSamples` to `makeBlankData` to allow for separating samples (columns)
- `add` and `location` to `plotClusterLegend`
* `makeBlankData` will now allow for making blank columns to separate groups of samples.
* `plotDendrogram` now allows for plotting of `colData` (previously `sampleData`) like `plotHeatmap` or `plotClusters`
* `clusterMany` now allows user-defined `ClusterFunction` objects to argument `clusterFunction`.
* Removed restriction in `plotClustersWorkflow` that only `clusterType="clusterMany"` allowed.
* Allow `getClusterManyParams` to search old `clusterMany` runs as well.
* Added `table` method to `plotHeatmap` (for plotting heatmap of results of `table` function)
* Added error catch if try to give argument `whichCluster` to `mergeCluster`.
* Added error catch if give param `whichClusters` to functions that only take `whichCluster` (singular) as an argument
* `plotFeatureBoxplot` now returns (invisibly) the colors and clusterIds along with the boxplot information.
Bugs:
* Add check for merge_nodeMerge table that `mergeClusterId` column can't be NA for entries where `isMerged=TRUE`
* Fix internal .makeIntegerClusters so that if given values `1:K` for input clustering will retain these same values (Issue #227)
* `mergeClusters` now returned object saves the merge information (and deletes old info and updates clusterType/clusterLabel of existing merge clusters), even if `mergeMethod="none"`.
* Fix `removeClusterings` so doesn't loose merge info unless deleting relevant clusterings.
Changes in version 2.1.4 ( Release date: 2018-06-27)
==============
Bugs:
* Fix tests that fail in devel version of Bioconductor (due to changes in other packages)
Changes in version 2.1.3 ( Release date: 2018-05-24)
==============
Bugs:
* Fix bug in how `clusterMany` and `defaultNDims` dealt with filtering choices in `reduceMethod`.
* Fixed bug in how `clusterMany` assigned label names
* Fixed bug so that `clusterMany` labels are increased (iteration version added) if `clusterMany` is rerun. Also, user-defined labels for functions like `mergeClusters` are now updated with iteration value if they are duplicated.
Changes in version 2.1.2 ( Release date: 2018-05-17)
==============
Bugs:
* Fix so that estimates of the proportion non-null in `mergeClusters` are always positive.
* Fix so that calculation of filter with `ignoreUnassignedVar` doesn't delete existing base filter of same type.
* Fix `plotClustersWorkflow` where wrong cluster was grabbed to plot.
* Fix bug in `plotDendrogram` where colors of clusters plotted with could be subsumbed by color of previous clustering. (git Issue `#220`)
Changes in version 2.1.1 ( Release date: 2018-05-15)
==============
Bugs:
* Fixed error introduced in 2.0.0 where arguments to `mergeCutoff` and `mergeLogFCcutoff` were passed instead to `dendroNDims`.
Changes in version 1.99.4 ( Release date: 2018-04-19)
==============
Changes:
* Added support for hdf5 files stored in `assay` slot via the `HDF5Array` package
* Removed most defaults from `RSEC` arguments -- pull them from underlying functions' defaults.
* replace `1:` with `seq_len` and `seq_along`
Changes in version 1.99.3 ( Release date: 2018-04-17)
==============
Changes:
* Re-implemented subsampleClustering() and combineMany() to use C++.
* Method "adjP" in `mergeClusters` now allows for further requirement that gene have a minimal log-fold change ('logFCcutoff').
Bugs:
* Fix bug in `setBreaks` (isPositive and isNegative variables)
* use `stringr::str_sort` to make sort of character values locale independent
Changes in version 1.99.2 ( Release date: 2018-03-22)
==============
Bugs:
* Fix `defaultNDims` so that returns minimum of 50 and the minimum dimension of data.
* Fix `RSEC` so still returns results if hit error after clusterMany.
Changes in version 1.99.0 ( Release date: 2018-02-15)
==============
Changes:
* MAJOR CHANGE TO DEFINITION OF CLASS: This version consists of a major update of how dimensionality reduction and filtering is done. The class has been updated to extend the new `SingleCellExperiment` class, which save the dimensionality reductions. Furthermore, calculating of per-gene statistics, which are usually used for filtering, are stored in `rowData` of the object and can be easily accessed and used for repeated filtering without recalculating. This has created a massive change under-the-hood in functions that allow dimensionality reduction and filtering. Changes to function names are the following:
- `transform` is now `transformData`
- New functions `makeReducedDims` and `makeFilterStats` will calculate (and thus store) dimensionality reductions and statistics for filtering the data.
- New function `filterData` will return the filtered data as a matrix
- New functions `listBuiltInReducedDims` and `listBuiltInFilterStats` give the list of currently available functions for dimensionality reduction and filtering statistics, respectively.
- Filtering on arbitrary statistics and user-defined dimensionality reduction can used in `clusterMany` and related functions, so long as they are saved in the appropriate slots of the object.
* Changed the following functions/arguments to be consistent with SingleCellExperiment naming conventions and improve distinction between terminology of cluster and clustering.
- Capitalized constructor functions. Now: `ClusterFunction()` and `ClusterExperiment()`
- `nPCADims` now changed to `nReducedDims` in clusterMany-related functions
- `nVarDims` now changed to `nFilterDims` in clusterMany-related functions
- `dimReduce` argument now changed to `reduceMethod` across functions
- `ndims` to `nDims` in `clusterSingle` and `makeDendrogram` to keep consistency.
- `plotDimReduce` to `plotReducedDims`
- Changed `nClusters` to `nClusterings` to better indicate purpose of function. `nClusters` now gives the number of clusters per clustering.
- `addClusters` to `addClusterings` and `removeClusters` to `removeClusterings`. New function `removeClusters` allows the user to actually ``remove" a cluster or clusters from a clustering by assigning samples in those clusters to `-1` value.
- `clusterInfo()` to `clusteringInfo()`
* In addition these structural changes, the following enhancements are also included in this release
- New function `plotClusterLegend` that will plot a legend for a clustering.
- Color definition changes: `showBigPalette` has been replaced with `showPalette` and now can show any palette of colors. Adjusted color definitions of `seqPal2` and `seqPal4` to be completely symmetric around center. The colors in `bigPalette` have been changed and shuffled to reduce similar colors and `massivePalette` has been created by adding all of the non-grey colors (in random order) from `colors()` so that `plotClusters` will not run out of colors.
- `getClusterManyParams`: now uses saved `clusterInfo` rather than more fragile `clusterLabels` to get parameters. The resulting output is formatted somewhat differently.
- `ClusterExperiment`: removed `transformation` as a required argument. Now sets with default of `function(x){x}`. Allows argument `clusterLegend` to define the clusterLegend slot in the constructor.
- `plotClustersWorkflow`: Argument `existingColors` in now takes arguments `ignore`,`all`,`highlightOnly` similar to `plotClusters`
- `plotDendrogram`: Argument `nodeColors` now available. Changed defaults so default is to do colorblock of samples.
- `plotContrastHeatmap`: Argument `contrastColors` now available to assign colors to the contrasts. Genes are now ordered by fold-change within each contrast.
- `plotClusters`: argument `existingColors` now allows for the option `firstOnly`
- `makeDendrogram`: now allows option 'coCluster' to the argument `reduceMethod` indicating use of the coClustering matrix to build the dendrogram. `makeDendrogram` now also has a method for building a dendrogram from an arbitrary distance function
- `clusterMatrix`: now returns cluster matrix with rownames corresponding to sample names.
- `convertClusterLegend`: now takes argument `whichClusters`
Bugs:
* converted automatic assignment of colors in `clusterLegend` to be based on `massivePalette` so won't run out on toy examples.
* fixed minor bugs in `plotHeatmap` so that will
- handle factor with only one value in annotation
- will plot annotation labels when there is `NA` in the annotation
- no longer calls internal function `NMF:::vplayout` in making those labels, more robust
* fixed bug in how `plotClustersWorkflow` handled existing colors.
* Fixed so `diss` now passed to subsampling in calls to clusterSingle/clusterMany
* Fixed so `plotClusters` now will not give incomprehensible error if given duplicates of a color
* Fixed `plotDendrogram` so will not create blank plot.
Changes in version 1.3.8 ( Release date: 2017-10-24)
==============
Changes:
* Updates to documentation
Changes in version 1.3.7 ( Release date: 2017-10-24)
==============
Changes:
* Added function `tableClusters` for tabling clusters by name.
* Added largeDataset option to subsampleClustering
* allow `mergeInfo` to be called in `plotDendrogram` to use stored merge info in plotting.
Bugs:
* Fixed bug in .makeMergeDendrogram in getFeatures
Changes in version 1.3.6 ( Release date: 2017-10-18)
==============
Changes:
* MAJOR CHANGE TO DEFINITION OF CLASS: Added slots to `ClusterExperiment` object so that the object keeps the information about the merging, and added corresponding helper functions (see documentation). This means previous versions made of a `ClusterExperiment` object will no longer be valid objects! Old, saved objects from previous versions must be manually adapted to have these slots (with `NULL` or `NA` values as appropriate).
* Add function `plotClustersWorkflow`, see documentation
* Add function `plotDimReduce` for plotting low-dim pca with points labeled by cluster.
* Add function `plotContrastHeatmap` to plot the significant genes that are result of getBestFeatures
* `getBestFeatures`: can now be run on result of mergeClusters without having to call makeDendrogram again for the merged clusters.
* Change argument to `plotClusters` and `plotBarplot` from `clusters` to `object`
* `makeDendrogram`: default in is now `dimReduce="mad"` to avoid accidentally calling it with `dimReduce="none"` (previous default), which can kill interactive R sessions.
* Changes to `plotHeatmap`:
- default in `plotHeatmap` to `clusterSamplesData="dendrogramValue"`.
- Changed handling of `clusterSamplesData` argument in `plotHeatmap` so that if argument equals `dendrogramValue` or `primaryCluster` and gives bad results / errors, will give warning and change argument appropriately.
- Added arguments to `plotHeatmap` allowing more user control regarding making blank lines for when have gene groupings.
* `mergeClusters` now aligns the colors from `mergeClusters` and `combineMany` internally.
* `mergeClusters` now suppresses warnings created by the estimation of the percentage non-null unless `showWarnings=TRUE`.
* `plotDendrogram`:
- has new argument `clusterLabelAngle` allowing user to change the angle of the clusterLabel printed on top (when plot is of type "colorblock")
- argument `labelType` has been changed to `plotType`
Bugs:
* Arguments passed to `mergeClusters`, ClusterExperiment version, via the `...` command will now go first to the `mergeClusters` matrix version and then onto the plotting, as stated in documentation (plotting arguments were being ignored on the clusterExperiment version of the function).
Changes in version 1.3.4 ( Release date: 2017-09-28)
==============
Changes:
* Add argument `clusterLabels` to `plotClusters` to allow user to set clusterLabels without changing the object.
* Added data object of run of RSEC. Changed `lazyLoad` to `false` because loading this object on installation was creating errors.
* Updated vignette to more completely cover RSEC.
* Change default `clusterFunction` argument to `RSEC` to be "hierarchical01" rather than all 01 methods (previous default). This reduces load of making simple call.
* Updated validity checks to reduce memory load, and dropped `validObject` calls.
* Added function `getClusterManyParams` to parse the parameter values in the clusterLabels from clusterMany
* Removed old dependency on diagram (from previous vignette, no longer needed)
Bugs:
* Fixed error in subsetting of clusterExperiment object when dendrogram is attached (previously didn't reset dendro_outbranch to NA)
Changes in version 1.3.3 ( Release date: 2017-09-07)
==============
Changes:
* Bug fix in clusterContrasts -- missing `match.arg` option for `outputType` argument
Changes in version 1.3.2 ( Release date: 2017-07-05)
==============
Changes:
* Default for `top.can` in seqCluster are changed to be `top.can=5`.
* makeDendrogram now has the default argument `ignoreUnassignedVar=TRUE` like in RSEC
* add ClusterFunction class and update all functions to work on this. All built in cluster functions are now given ClusterFunction Objects, so all built in clustering functions can now work for either `subsampleClustering` or `mainClustering`. This will also make it easier for a user to define their own ClusterFunction object and have it be used by functions like `clusterSingle`. This is a major change in how some of the underlying functions work, but should not impact common functions like `clusterMany` and `RSEC`. Some of the more notable changes in the arguments for programmers are:
- `clusterD` and `clusterDArgs` have been changed to `mainClustering` and `mainClusterArgs`. This change was made to make these arguments more clear as to their role in the clustering workflow (and because the clusterD refered to clustering a dissimilarity but it has clustered either x or D for many versions now. )
- `seqCluster` and `clusterSingle` no longer take the argument `clusterFunction`. `clusterFunction` must be given via `mainClusterArgs` and `subsampleArgs` to be passed to `mainClustering` or `subsamplingCluster`, respectively. Now only the upper-level function `clusterMany` takes `clusterFunction` directly as an argument.
- `mainClustering` (previously `clusterD`) and `subsampleClustering` no longer take `k` nor `alpha` as a direct argument. These arguments, like all arguments used directly by the cluster function, now need to be passed to the clustering function in a list via `clusterArgs`.
- The list of available built-in clustering routines provided by the package can now be accessed via `listBuiltInFunctions()`. The functions that are used for these functions are now available to the user via their ClusterFunction object that the user can access with the function `getBuiltInFunction`. (see ?listBuiltInFunctions)
* `hiearchical01` clustering now has a different default, namely to apply `as.dist` to the input `diss` in order to get a `dist` object, rather than `dist(1-diss)` which was previously the default for historical reasons. This is controlled by argument `whichHierDist`, and can be set to the previous version by passing `whichHierDist="dist"` to the `clusterArgs` argument in either `subsampleArgs` or `mainClusterArgs`, depending on where `hierarchical01` is being used.
* Spectral clustering is now available (`"spectral"`) via the `specc` function of `kernlab`.
* `clusterSingle` now only returns the dissimilarity matrix in the `coClustering` slot if `subsample=TRUE` in the call. Furthermore, for the resulting dissimilarity to replace an existing `coClustering` slot value, the user must request it by setting `replaceCoClustering=TRUE` in the call to `clusterSingle`.
* Removed default value for argument `proportion` in `combineMany`. The previous default was `proportion=1` but didn't match most common use cases and was accidentally called by upper functions like RSEC.
* If the `clusterFunction` argument is not given to `subsampleArgs` by the user explicitly, and the `clusterFunction` of `mainClusterArgs` is appropriate, it will be used for `subsampleClustering`; if the `clusterFunction` in `mainClusterArgs` is not appropriate (e.g. `subsampleClustering` needs a type `K` because `sequential=TRUE`), then the default for the `subsampleClustering` will be `'pam'`. This changes the previous behavior of `subsampleClustering` where the default was 'pam' in all cases where not explicitly given by the user. This change should have no impact on RSEC: since the `clusterFunction` for the `mainClustering` terms is a `'01'` type in RSEC and the `subsampleClustering` has to be type `'K'` when `sequential=TRUE`, it will revert to the default `"pam"` as before.
Bugs:
* Fixed error so where if `clusterSingle` was called on existing clusterExperiment object it would overwrite the information of the existing `clusterExperiment` object.
* Fixed `RSEC` so now if rerun on existing `clusterExperiment` object, it will grab defaults from the matrix version (previously defaults were those of the underlying function, which were not always the same, e.g. `combineProportion` default was previously 1)
* Fixed `clusterMany` so now it explicitly sets `dimReduce="none"` in call to `clusterSingle`. Before, might have been calling all of the `dimReduce` defaults (i.e. all of them!).
* Fixed so gives error if whichClusters in `plotBarplot` doesn't match anything.
Changes in version 1.3.1 ( Release date: 2017-06-14 )
=======
Changes:
* change how `plotHeatmap` handles visualizeData argument, so not required to have same number of genes as original, only same number of samples.
* Now if color of vectors given in `clusterLegend` does not have names, `plotHeatmap` will give them names matching the variable so that they will be used by `aheatmap` (previously would have left all colors white because do not have matching names).
* Large changes to how dendrograms are plotted by `plotDendrogram` and `mergeClusters`. This includes the ability to see the before and after clusterings along side the mergeClusters result, as well as a new slot added to the clusterExperiment class (`dendro_outbranch`). The names of several arguments to `mergeClusters` and `plotDendrogram` were changed for clarity:
- `leaves` is now `leafType` in `plotDendrogram`.
- `plotType` is now `plotInfo` in `mergeClusters`
- `doPlot` is now `plot` in `mergeClusters`
- `leafType` is now an option for `mergeClusters` as well.
- Now when `plotInfo` (previously `plotType`) is set to `none`, the plot is still drawn, but just no information about the merging is added to the plot. To not plot the dendrogram at all, set `plot=FALSE`.
- The option `labelType` in either `plotDendrogram` or `mergeClusters` controls whether names (`name`) or rectangular color blocks corresponding to the cluster (`colorblock`) are put at the tips of the dendrogram to label the clusters/samples.
* added `dendroClusterIndex` that behaves similarly to `primaryClusterIndex`
* added ability to give `dendro` as charater option to `whichClusters` argument
* added `transformation<-` to be able to assign manually the transformation slot
* Move MAST into 'suggests' pacakge so that not need R 3.4 to run the package.
* Change calculation of PCA dimensionality reduction to use `svds` from `RSpectra` package to improve speed
Bugs:
* Fixed bug in RSEC where `combineProportion` argument was being ignored (set to 1)
* Fixed bug in definition of `transform` so that extends existing generic rather than masking it.
Changes in version 1.3.0 ( Release date: 2017-05-24 )
==============
Changes:
* `plotHeatmap` accepts `data.frame` or `ExpressionSet` objects for the data argument (calls `data.matrix` or `exprs` on object and sends to matrix version)
* Added `plotBarplot` to plot a barplot for 1 cluster or comparison of 2 clusters along with tests.
* Added `whichClusters` argument to `clusterMatrix` to return only clusters corresponding to certain clusters. Mainly relevant for using arguments like `workflow` that are used by other commands (otherwise could just index the complete matrix manually...)
Bug fixes:
* `plotHeatmap` now goes through the `clusterLegend` input and removes levels that do not exist in the sampleData; this was causing incorrect coloring when the `clusterLegend` had more (or less) levels that it assigned color to than the `sampleData` did (e.g. if `sampleData` was a subset of larger dataset upon which the original colors were assigned.) NOTE: that this now has the effect of NOT plotting all values in the clusterLegend if they are not represented in the data, thus changing the previous behavior of `plotHeatmap` legend.
* fixed bug in how `plotHeatmap` checked that the dimensions of user-supplied dendrogram match that of data (matrix version).
* fixed `convertClusterLegend` so when `output` is `matrixNames` or `matrixColors`, the resulting matrix has the `colnames` equal to cluster labels, like `clusterMatrix`.
* internal function .convertToNum now preserves names of input vector.
* fixed bug in plotting with merge clusters; previously if plotType="all", might not have been correctly plotted with the right internal node of the dendrogram.
Changes in version 1.2.0 ( Release date: 2017-04-04 )
==============
Changes:
* RSEC now has option `rerunClusterMany`, which if FALSE will not rerun the clusterMany step if RSEC is called on an existing clusterExperiment object (assuming of course, clusterMany has been run already on the object)
* setBreaks now has option `makeSymmetric` to force symmetric breaks around zero when using the quantile option.
* setBreaks now has a default for breaks (i.e. for minimal use, the user doesn't have to give the argument, just the data) in which case setBreaks will automatically find equal-spaced breaks of length 52 filling the range of data compatible with aheatmap. The order of the arguments `data` and `breaks` has been switched, however, to better accomodate this usage.
* plotClusters can now handle NA values in the colData
* plotClusters for `clusterExperiment` object now allows for setting `sampleData=TRUE` to indicate the plotting all of the sampleData in the colData slot.
* nPCADims now allows values between 0,1 to allow for keeping *proportion* of variance explained.
* addClusters now allows for argument `clusterLabel` to assign a clusterLabel when the added cluster is a vector (if matrix, then clusterLabel is just the column names of the matrix of cluster assignments)
Bug fixes:
* fixed bug in clusterExperiment subsetting to deal with orderSamples correctly.
* fixed bug in mergeClusters unable to plot when too big of edge lengths (same as plotDendrogram)
* fixed bug in subsetting, where unable to subset samples by character
* fixed bug in removeClusters so that correctly updates dendro_index and primary_index slots after cluster removed.
Changes in version 1.1.1 (Release date: 2016-10-14 )
==============
Changes:
* Inverted definition of contrast for one-versus-all so now is X-ave(all but X); this means logFC positive -> cluster X greater than average of rest
Bug fixes:
* add check in clusterMany that non-zero dimensions
* changed 'warning' to 'note' in combineMany when no clusters specified.
* fixed bug in plotDendrogram unable to plot when makeDendrogram used dimReduce="mad"
* fixed bug in clusterMany where beta set to NA if clusteringFunction is type K. Added check that beta cannot be NA.
* Added check that alpha and beta in (0,1)
Changes in version 0.99.3 (Release date: 2016-07-26)
==============
Changes:
* plot in mergeClusters now uses cluster names and colors from clusterLegend
* plotDendrogram now calls plot.phylo
* add 'clusterLabel' argument to `clusterSingle`
* add options 'mad' and 'cv' to the dimensionality reduction. Also made option to only use clustered samples for feature reduction for relevant functions (e.g. `makeDendrogram`).
* clusterSingle now always returns the D matrix to the slot coClustering (previously only did so if D was from subsampling).
* change so that clusterSingle takes dissimilarity matrix, and now clusterMany calculates dissimilarities up front (rather than recalculating each time)
* add RSEC function for wrapper that leads to RSEC algorithm.
* add test for clusterMany to make sure replicable with past results (not unit test because too long to run, so not part of R build)
Bug fixes:
* fix bug in .TypeIntoIndices so that handles mix of clusterType and clusterLabels in whichClusters
* fixed bug in plotCoClustering so handles clusterSamplesData
* D for clusterD is now distance, not similarity, for 0-1, meaning larger values are values that are less similar.
* fix bug in plotClusters that would give clusterLegend entries that were vectors, not matrices.
Changes in version 0.99.1 (Release date: 2016-05-24 )
==============
Changes:
* changes to pass development version of bioConductor checks.
Changes in version 0.99.0 (Release date: 2016-05-24 )
==============
Changes:
* changed number to indicate bioconductor submission
Changes in version 0.2.0 (Release date: 2016-05-10 )
==============
Changes:
* Allow 'whichCluster'/'whichClusters' arguments to match to clusterLabels, not just clusterTypes
* Added slot 'dendro_index'
* Added 'whichCluster' argument to `makeDendrogram`
* Added 'hierarchicalK' clustering
* Added default distance for 0-1 clustering
* Added ability to define distance for clustering
* Added 'setToCurrent' and 'setToFinal' options to update status of a cluster.
* Added unit tests for workflow function (in test_constructor)
* 'getBestFeatures' now calls 'clusterContrasts' internally
* Output for 'clusterContrasts' changed
* Removed 'Index' output for getBestFeatures
* Changed tests for getBestFeatures to run on standard objects (which means now have -2 values to test against)
* User can now give clusterLabel for resulting cluster of combineMany and mergeClusters
Changes in version 0.1.0 (Release date: 2016-05-04)
==============
Changes:
* Conversion to S4 language for bioConductor submission
* All previous functions have been overhauled, renamed, etc.
Changes in version 0.0.0.9006
==============
Changes:
* fixed so that mergeClusters, clusterHclust, and getBestFeatures will appropriately convert if the input of clustering vector is a factor rather than numeric (with warning).
* fixed mergeClusters to have option to indicate that input matrix is a count matrix (in which case will create dendrogram with log(counts+1) and will do getBestFeatures with the voom correction)
* added more tutoral-oriented vignette (old vignette is now the documentation vignette with more detail about the internal workings of package). Currently is just simulated data, but will be updated to real single-cell sequencing dataset.
Changes in version 0.0.0.9005
==============
Changes:
* Changed simulated data so load all with data(simData) rather than separate calls for simData and simCount. Also added 'trueCluster' vector to give true cluster assignments of simulated data
* added dendro example to getBestFeatures
* added example to clusterHclust
* added single function for converting to phylobase tree (used internally by package)
* added functionality to find proportion of significant null hypotheses for merging clusters (mergeClusters)
Changes in version 0.0.0.9004
==============
Changes:
* Changed clusterMany.R to only set k<-NA if sequential=FALSE (previously for all where findBestK=TRUE)
* Added to vignette
* fixed bug in plotClusters to correctly plot "-1"