
Awesome Multiple Hypothesis Testing

Extensive collection of resources on the topic of multiple hypothesis testing.


1 Publications

1.1 Philosophical

1.2 Introductory

1.3 Seminal: Static

Bonferroni Correction

Details

Algorithm for controlling the FWER in (static) hypothesis testing [Dunn1961]. The adjusted threshold $\alpha_i$ for $k$ tested hypotheses is calculated as:

$$\alpha_i = \frac{\alpha}{k}$$
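The correction above can be sketched in a few lines of Python; the function name and the example p-values are illustrative, not taken from a particular library.

```python
# A minimal sketch of the Bonferroni correction for a plain list of p-values.

def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / k."""
    k = len(p_values)
    threshold = alpha / k  # adjusted per-test level alpha_i
    return [p <= threshold for p in p_values]

# With k = 4 tests, the adjusted level is 0.05 / 4 = 0.0125.
print(bonferroni_reject([0.001, 0.02, 0.04, 0.2]))  # [True, False, False, False]
```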

Benjamini-Hochberg Procedure

Details

Algorithm for controlling the FDR in (static) hypothesis testing for p-values that are independent or with positive regression dependency on subsets:

  • Given $\alpha$, sort the $m$ p-values in ascending order $P_1 \leq \ldots \leq P_m$ and find the largest $k$ such that $P_k \leq \frac{k}{m} \alpha$.
  • Reject $\mathcal{H}_i$ for $i=1, 2, \ldots, k$.

[BenjaminiHochberg1995]
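The two steps above can be sketched as a short step-up procedure in Python; function and variable names are my own, not from a particular library.

```python
# A minimal sketch of the Benjamini-Hochberg step-up procedure.

def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of rejected hypotheses under the BH procedure."""
    m = len(p_values)
    # Sort p-values ascending, remembering original indices.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest k (1-based) with P_k <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

# Note the step-up behaviour: 0.039 exceeds its own threshold 3/4 * 0.05,
# but is still rejected because the condition holds at k = 4.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041]))  # [0, 1, 2, 3]
```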

Benjamini-Yekutieli Procedure

Details

Algorithm for controlling the FDR in (static) hypothesis testing for p-values under arbitrary dependence [BenjaminiYekutieli2001]. It modifies the threshold of the Benjamini-Hochberg procedure as follows:

$$P_k \leq \frac{k}{m c(m)} \alpha$$

  • The standard Benjamini-Hochberg Procedure can be recovered by $c(m)=1$ for independent or positively correlated p-values.
  • Under arbitrary dependence, $c(m)$ is the harmonic number $c(m)=\sum_{i=1}^{m}\frac{1}{i}$.
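The modified threshold can be sketched by inflating the Benjamini-Hochberg threshold with the harmonic number $c(m)$; names are illustrative, not from a particular library.

```python
# A minimal sketch of the Benjamini-Yekutieli step-up procedure.

def benjamini_yekutieli(p_values, alpha=0.05):
    """Return the indices of rejected hypotheses under the BY procedure."""
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic number c(m)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest k (1-based) with P_k <= (k / (m * c(m))) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * alpha:
            k_max = rank
    return sorted(order[:k_max])

# More conservative than plain BH on the same p-values: only the two
# smallest survive the c(m)-inflated threshold.
print(benjamini_yekutieli([0.001, 0.008, 0.039, 0.041]))  # [0, 1]
```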

1.4 Seminal: Sequential

SAFFRON: Serial estimate of the Alpha Fraction that is Futilely Rationed On true Null hypotheses. [RamdasZrnic2018]

Details

Algorithm for controlling FDR in sequential (online) hypothesis testing for independent p-values that was proposed by [RamdasZrnic2018].

SAFFRON adaptively estimates the proportion of true $\mathcal{H}_0$, i.e. it adjusts the test levels $\alpha_i$ based on an estimate of the amount of alpha wealth that is allocated to testing true $\mathcal{H}_0$. SAFFRON depends on the constants $w_0$ and $\lambda$, with $w_0$ as the initial alpha wealth, satisfying $0 \leq w_0 \leq \alpha$. The parameter $\lambda \in (0,1)$ defines the threshold for a candidate, as SAFFRON never rejects p-values $\geq \lambda$. Candidates are hypotheses that are more likely to be discoveries:

  • At each time $t$, define the number of candidates after the j-th rejection as

$C_{j+} = C_{j+}(t) = \sum_{i = \tau_j + 1}^{t-1} C_i$

with $C_t = \mathbf{1}\{p_t \leq \lambda\}$ as the indicator for candidacy.

  • Subsequent test levels are chosen as $\alpha_t = \min\{\lambda, \tilde{\alpha}_t\}$ with the exception

$\alpha_1 = \min\{(1 - \lambda)\gamma_1 w_0, \lambda\}$

and subsequent

$\tilde{\alpha}_t = (1 - \lambda) [w_0 \gamma_{t-C_{0+}} + (\alpha - w_0)\gamma_{t-\tau_1-C_{1+}} + \alpha \sum_{j \geq 2} \gamma_{t - \tau_j- C_{j+}}]$

Typically, $\gamma_j \propto j^{-1.6}$ is used as the $\gamma$ sequence.
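The update rule above can be sketched in Python as follows. This is a simplified re-implementation of the quoted formulas, not the authors' reference code; the truncation point of the $\gamma$ sequence and all names are implementation choices.

```python
# A minimal sketch of the SAFFRON test-level update for a fixed p-value stream.

def saffron_levels(p_values, alpha=0.05, w0=0.025, lam=0.5):
    """Return the test levels alpha_t and the rejection times (1-based)."""
    T = len(p_values)
    # gamma_j proportional to j^{-1.6}, normalised over a finite truncation.
    raw = [j ** -1.6 for j in range(1, 10 * T + 2)]
    total = sum(raw)
    gamma = [g / total for g in raw]  # gamma[j - 1] = gamma_j

    levels, rejections, cand = [], [], []  # cand[i - 1] = 1{p_i <= lam}
    for t in range(1, T + 1):
        # C_{j+}(t): candidates after the j-th rejection, up to time t - 1.
        def c_plus(tau_j):
            return sum(cand[tau_j : t - 1])

        if t == 1:
            alpha_t = min((1 - lam) * gamma[0] * w0, lam)
        else:
            term = w0 * gamma[t - c_plus(0) - 1]
            if rejections:
                tau1 = rejections[0]
                term += (alpha - w0) * gamma[t - tau1 - c_plus(tau1) - 1]
                for tau_j in rejections[1:]:
                    term += alpha * gamma[t - tau_j - c_plus(tau_j) - 1]
            alpha_t = min(lam, (1 - lam) * term)

        levels.append(alpha_t)
        p = p_values[t - 1]
        cand.append(1 if p <= lam else 0)
        if p <= alpha_t:
            rejections.append(t)
    return levels, rejections

levels, rejections = saffron_levels([0.001, 0.3, 0.01, 0.7])
print(rejections)  # the very small p-values are rejected early
```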

ADDIS: An ADaptive algorithm that DIScards conservative nulls. [TianRamdas2019]

Details

Algorithm for controlling FDR in sequential (online) hypothesis testing for independent p-values that was proposed by [TianRamdas2019]. ADDIS iterates on SAFFRON by extending SAFFRON's adaptivity in the fraction of true $\mathcal{H}_0$ with adaptivity in the conservativeness of $\mathcal{H}_0$. ADDIS depends on the constants $w_0$, $\lambda$ and $\tau$, with $w_0$ as the initial alpha wealth, satisfying $0 \leq w_0 \leq \alpha$. The new parameter $\tau \in (0,1]$ defines the threshold for discarding (conservative) p-values: p-values $\geq \tau$ are discarded, i.e. not considered for testing, and no wealth is invested in them. As in SAFFRON, the parameter $\lambda \in [0,\tau)$ defines the threshold for candidates, as ADDIS never rejects p-values $\geq \lambda$. The test levels are chosen as:

$\alpha_t = \min\{\lambda, \tilde{\alpha}_t\}$

$\tilde{\alpha}_t = (\tau - \lambda)[w_0 \gamma_{S^t-C_{0+}} + (\alpha - w_0)\gamma_{S^t - \kappa_1^*-C_{1+}} + \alpha \sum_{j \geq 2} \gamma_{S^t - \kappa_j^* - C_{j+}}]$

$\kappa_j = \min\{i \in [t-1] : \sum_{k \leq i} \mathbf{1}\{p_k \leq \alpha_k\} \geq j\}, \quad \kappa_j^* = \sum_{i \leq \kappa_j} \mathbf{1}\{p_i \leq \tau\}, \quad S^t = \sum_{i < t} \mathbf{1}\{p_i \leq \tau\}, \quad C_{j+} = \sum_{i = \kappa_j + 1}^{t-1} \mathbf{1}\{p_i \leq \lambda\}$

Typically, $\gamma_j \propto j^{-1.6}$ is used as the $\gamma$ sequence.
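The ADDIS update can be sketched analogously to SAFFRON. Again this is a simplified re-implementation of the quoted formulas, not the authors' reference code; indexing $\gamma$ from 0 and the truncation point are implementation conventions.

```python
# A minimal sketch of the ADDIS test-level update for a fixed p-value stream.

def addis_levels(p_values, alpha=0.05, w0=0.025, lam=0.25, tau=0.5):
    """Return the test levels alpha_t and the rejection times kappa_j (1-based)."""
    T = len(p_values)
    raw = [(j + 1) ** -1.6 for j in range(10 * T + 2)]
    total = sum(raw)
    gamma = [g / total for g in raw]  # gamma[j], j >= 0

    levels, rejections = [], []
    cand, kept = [], []  # 1{p_i <= lam} and 1{p_i <= tau} (not discarded)
    for t in range(1, T + 1):
        s_t = sum(kept)  # S^t: non-discarded p-values among i < t

        # C_{j+}: candidates after the j-th rejection, up to time t - 1.
        def c_plus(kappa_j):
            return sum(cand[kappa_j : t - 1])

        term = w0 * gamma[s_t - c_plus(0)]
        if rejections:
            k1 = rejections[0]
            k1_star = sum(kept[:k1])  # kappa_1^*
            term += (alpha - w0) * gamma[s_t - k1_star - c_plus(k1)]
            for kj in rejections[1:]:
                kj_star = sum(kept[:kj])
                term += alpha * gamma[s_t - kj_star - c_plus(kj)]
        alpha_t = min(lam, (tau - lam) * term)

        levels.append(alpha_t)
        p = p_values[t - 1]
        cand.append(1 if p <= lam else 0)
        kept.append(1 if p <= tau else 0)  # p-values >= tau are discarded
        if p <= alpha_t:
            rejections.append(t)
    return levels, rejections
```

Note that discarded p-values (those $\geq \tau$) still advance the time index but never contribute to $S^t$, so no wealth is spent on them.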

1.5 Seminal: Batching

Batch$_{\text{BH}}$ and Batch$_{\text{Storey-BH}}$

Interpolation algorithm between existing pure sequential (online) and static (offline) methods providing a trade-off between statistical power and temporal application. [Zrnic2020]

1.6 Applications and Modifications

1.6.1 Anomaly Detection


2 Presentations


3 Software Packages

  • R: onlineFDR [Robertson2019]
  • Python: multipy [Puoliväli2020]
  • Python: statsmodels [Seabold2010]

3.1 Repositories

3.2 Miscellaneous


4 References

[Dunn1961] Dunn, O. J. (1961). Multiple Comparisons Among Means. Journal of the American Statistical Association, 56(293), 52–64.

[Tukey1991] Tukey, J. W. (1991). The Philosophy of Multiple Comparisons. Statistical Science, 6(1), 100–116.

[BenjaminiHochberg1995] Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289–300.

[BenjaminiYekutieli2001] Benjamini, Y., & Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29(4), 1165–1188.

[Benjamini2002] Benjamini, Y., & Braun, H. (2002). John W. Tukey's contributions to multiple comparisons. The Annals of Statistics, 30(6), 1576-1594.

[Lee2018] Lee, S., & Lee, D. K. (2018). What is the proper way to apply the multiple comparison test?. Korean journal of anesthesiology, 71(5), 353–360.

[Robertson2023] Robertson, D. S., Wason, J. M. S., & Ramdas, A. (2023). Online multiple hypothesis testing. Statistical science : a review journal of the Institute of Mathematical Statistics, 38(4), 557–575.

[Robertson2019] Robertson DS, Liou L, Ramdas A, Karp NA (2022). onlineFDR: Online error control. R package 2.12.0.

[Puoliväli2020] Puoliväli T, Palva S, Palva JM (2020): Influence of multiple hypothesis testing on reproducibility in neuroimaging research: A simulation study and Python-based software. Journal of Neuroscience Methods 337:108654.

[Seabold2010] Seabold, Skipper, and Josef Perktold. “statsmodels: Econometric and statistical modeling with python.” Proceedings of the 9th Python in Science Conference. 2010.

[RamdasZrnic2018] Ramdas, A., Zrnic, T., Wainwright, M.J., & Jordan, M.I. (2018). SAFFRON: an adaptive algorithm for online control of the false discovery rate. International Conference on Machine Learning.

[TianRamdas2019] Tian, J., & Ramdas, A. (2019). ADDIS: An adaptive discarding algorithm for online FDR control with conservative nulls. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 32). Curran Associates, Inc.

[Zrnic2020] Zrnic, T., Jiang, D., Ramdas, A., & Jordan, M.I. (2020). The Power of Batching in Multiple Hypothesis Testing. International Conference on Artificial Intelligence and Statistics.