Implement DBSCAN/OPTICS as an mz_group option? #34

Open
wkumler opened this issue Mar 5, 2024 · 1 comment
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@wkumler
Owner

wkumler commented Mar 5, 2024

Realized today that m/z group construction could be done with a 1D density-based clustering algorithm like DBSCAN or OPTICS. The perk is that the "hard" m/z window currently used by mz_group would be relaxed, and the grouping could be determined in a more data-driven way.
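One possible way to make the eps choice "data-driven" is to inspect k-nearest-neighbor distances, which the dbscan package exposes as kNNdist(). The sketch below uses made-up m/z values (two dense traces plus three stray points), not real RaMS output, and the choice of k = minPts - 1 is a common heuristic rather than anything mz_group does today.

```r
library(dbscan)

set.seed(1)
# Hypothetical data: two dense m/z traces plus three isolated points
mz <- c(rnorm(200, mean = 118.0865, sd = 0.00002),
        rnorm(200, mean = 118.0900, sd = 0.00002),
        118.05, 118.07, 118.11)

# Distance from each point to its k-th nearest neighbor
# (k = minPts - 1 is a heuristic assumption, not a rule)
knn_d <- kNNdist(as.matrix(mz), k = 99)

# Points inside dense traces have tiny k-NN distances; isolated points
# have much larger ones. A gap in these quantiles suggests where eps could sit.
quantile(knn_d, probs = c(0.5, 0.9, 0.99, 1))
```

If the quantiles show a clear jump, any eps inside that gap should separate dense traces from stray points; real data will likely be messier than this toy case.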

There's a paper about this exact idea: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982975/. The authors talk about reducing the computational cost through some clever preprocessing, which is necessary because the current implementation takes a long while for just 6 files.

Quick proof-of-concept:

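library(RaMS)
library(data.table)  # needed for %between%

# Load the example mzML files bundled with RaMS
ms_filedir <- system.file("extdata", package="RaMS")
ms_files <- list.files(ms_filedir, pattern="LB.*mzML", full.names=TRUE)
msdata <- grabMSdata(ms_files)

library(dbscan)
# Cluster MS1 m/z values in 1D; dbscan labels noise points as cluster 0
mz_groups <- dbscan(as.matrix(msdata$MS1$mz), eps = 0.0001, minPts = 100)
msdata$MS1$mz_group <- mz_groups$cluster

library(ggplot2)
# Plot a narrow m/z range, colored by cluster assignment
ggplot(msdata$MS1[mz %between% c(110, 130)]) +
  geom_point(aes(x=rt, y=mz, color=factor(mz_group)))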
@wkumler
Owner Author

wkumler commented Mar 5, 2024

One big perk of this method is that it would identify a bunch of the "noise" data points (isolated single points) so they can be removed instead of each having to be assigned to an m/z group. min_group_size already kinda does this but not very well(?)
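A minimal sketch of the noise-removal idea, again on made-up m/z values rather than real data: the dbscan package assigns cluster 0 to points that don't belong to any dense region, so filtering on that label drops them. The eps and minPts values are carried over from the proof-of-concept above and are not tuned.

```r
library(dbscan)

set.seed(1)
# Hypothetical data: two dense m/z traces plus three isolated points
mz <- c(rnorm(200, mean = 118.0865, sd = 0.00002),
        rnorm(200, mean = 118.0900, sd = 0.00002),
        118.05, 118.07, 118.11)

clust <- dbscan(as.matrix(mz), eps = 0.0001, minPts = 100)

# Cluster 0 is the noise label; every other integer is an m/z group
table(clust$cluster)

# Keep only points that landed in a real group
mz_clean <- mz[clust$cluster != 0]
```

In this toy case the stray points sit far from both traces, so they are unambiguous noise. Whether that holds on real MS1 data depends on how eps and minPts are chosen.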

wkumler added the enhancement and good first issue labels on Sep 18, 2024