Discover dependency tree for Darwin packages #61
Yes, exactly what I was thinking.
I messed around a bit to get the dependency hierarchy. It queries the GitHub API directly, so it updates when the repos change.
# Fetch and parse the DESCRIPTION file of a GitHub repository.
# Returns a desc object, or NULL if the file is missing or cannot be parsed.
getDescriptions <- function(org, name) {
  res <- httr::GET(
    url = sprintf("https://api.github.com/repos/%s/%s/contents/DESCRIPTION", org, name),
    httr::accept_json(),
    httr::add_headers(
      "Accept" = "application/vnd.github.raw+json",
      "Authorization" = sprintf("Bearer %s", Sys.getenv("GITHUB_PAT")),
      "X-GitHub-Api-Version" = "2022-11-28"
    )
  )
  content <- res |>
    httr::content()
  if (is.raw(content)) {
    # Success: the raw media type returns the file contents as bytes
    text <- content |>
      rawToChar() |>
      strsplit(split = "\n") |>
      unlist()
    d <- tryCatch({
      desc::desc(text = text)
    }, error = function(e) {
      return(NULL)
    })
    return(d)
  } else if (is.list(content)) {
    # Error responses arrive as parsed JSON; a 404 only means the repo has
    # no DESCRIPTION file, so report any other status
    if (!identical(content$status, "404")) {
      message(content$status)
    }
    return(NULL)
  } else {
    warning(sprintf("Unexpected response type for %s/%s", org, name))
    return(NULL)
  }
}
# List the organisation's repositories (first page only, up to 100 repos)
res <- httr::GET(
  url = "https://api.github.com/orgs/darwin-eu-dev/repos?per_page=100",
  httr::accept_json(),
  httr::add_headers(
    "Accept" = "application/vnd.github+json",
    "Authorization" = sprintf("Bearer %s", Sys.getenv("GITHUB_PAT")),
    "X-GitHub-Api-Version" = "2022-11-28"
  )
)
repos <- httr::content(res)
descs <- lapply(repos, function(repo) {
  getDescriptions(org = "darwin-eu-dev", name = repo$name)
})
# Drop repositories without a parsable DESCRIPTION file
descs <- descs[!vapply(descs, is.null, logical(1))]
darwinPkgs <- lapply(descs, function(d) {
  d$get_field("Package")
}) |>
  unlist()
# Build an edge list: one row per (package, Darwin dependency) pair
df <- lapply(descs, function(d) {
  name <- d$get_field("Package")
  deps <- d$get_deps()
  # Ignore Depends and Suggests, then keep only other Darwin packages
  deps <- deps[!deps$type %in% c("Depends", "Suggests"), ]["package"] |>
    unlist() |>
    as.character()
  deps <- deps[deps %in% darwinPkgs]
  if (length(deps) == 0) {
    # Placeholder so data.frame() gets one row; filtered out below
    deps <- ""
  }
  data.frame(
    pkg = name,
    dependencies = deps
  )
}) |>
  do.call(what = "rbind")
df <- df[df$dependencies != "", ]
g <- igraph::graph_from_data_frame(df)
par(mar = c(0, 0, 0, 0))
plot(g)
Created on 2024-08-20 with reprex v2.1.1
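Since the goal here is a dependency hierarchy, the same kind of edge list can also be put into a dependency-first order instead of only being plotted. The following is a minimal sketch, assuming the graph has no cycles; the package names pkgA, pkgB and pkgC are hypothetical placeholders, not real Darwin packages, and the edge list only mimics the shape of `df` above.

```r
# Hypothetical edge list in the same shape as `df` (pkg -> dependency).
# pkgA, pkgB and pkgC are placeholder names, not real Darwin packages.
edges <- data.frame(
  pkg = c("pkgC", "pkgC", "pkgB"),
  dependencies = c("pkgB", "pkgA", "pkgA")
)

g <- igraph::graph_from_data_frame(edges)

# A topological order only exists when there are no circular dependencies.
stopifnot(igraph::is_dag(g))

# Edges point from a package to what it imports. With mode = "in", each
# vertex comes before the vertices that point to it, so dependencies are
# listed before the packages that use them.
buildOrder <- igraph::as_ids(igraph::topo_sort(g, mode = "in"))
print(buildOrder)
```

Reading the result from left to right gives the packages from the bottom of the hierarchy to the top, which may be a useful complement to the plot when the graph gets crowded.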
Thank you!
library(dplyr)
pkgs <- c(
  "TreatmentPatterns",
  "IncidencePrevalence",
  "DrugUtilisation",
  "CohortSurvival",
  "DrugExposureDiagnostics",
  "PatientProfiles",
  "CohortCharacteristics",
  "CDMConnector",
  "omopgenerics"
)
# Clone each package into a temporary directory (skipped if already present)
# and wrap it in a PaRe Repository object
repos <- lapply(pkgs, function(pkg) {
  tempDir <- file.path(tempdir(), pkg)
  if (!dir.exists(tempDir)) {
    git2r::clone(
      url = sprintf("https://github.com/darwin-eu-dev/%s.git", pkg),
      local_path = tempDir,
      credentials = git2r::cred_token()
    )
  } else {
    message(sprintf("%s already exists, not cloning", tempDir))
  }
  PaRe::Repository$new(tempDir)
})
#> cloning into 'C:\Users\MVANKE~1\AppData\Local\Temp\RtmpaEQjDg/TreatmentPatterns'...
#> Receiving objects: 1% (74/7342), 63 kb
#> Receiving objects: 11% (808/7342), 456 kb
#> Receiving objects: 21% (1542/7342), 1024 kb
#> Receiving objects: 31% (2277/7342), 2977 kb
#> Receiving objects: 41% (3011/7342), 4930 kb
#> Receiving objects: 51% (3745/7342), 5211 kb
#> Receiving objects: 61% (4479/7342), 7452 kb
#> Receiving objects: 71% (5213/7342), 8460 kb
#> Receiving objects: 81% (5948/7342), 8629 kb
#> Receiving objects: 91% (6682/7342), 8741 kb
#> Receiving objects: 100% (7342/7342), 16255 kb, done.
#> ...
# Collect the function calls found in each repository, tagged with its name
funcUse <- lapply(repos, function(repo) {
  repo$getFunctionUse() %>%
    dplyr::mutate(source_pkg = repo$getName())
}) %>%
  dplyr::bind_rows()
#> Started on file: CDMInterface.R
#> Started on file: CharacterizationPlots.R
#> Started on file: computePathways.R
#> ...
# Keep only calls into the listed Darwin packages, labelled as pkg::fun
df <- funcUse %>%
  dplyr::filter(.data$pkg %in% pkgs) %>%
  dplyr::mutate(call = sprintf("%s::%s", .data$pkg, .data$fun)) %>%
  dplyr::select("source_pkg", "call")
g <- igraph::graph_from_data_frame(df)
par(mar = c(0, 0, 0, 0))
plot(g)
# One graph per package, leaving out calls into the package itself
for (pkg in pkgs) {
  print(pkg)
  d <- df %>%
    dplyr::filter(
      .data$source_pkg == pkg,
      !startsWith(.data$call, paste0(pkg, "::"))
    )
  g <- igraph::graph_from_data_frame(d)
  par(mar = c(0, 0, 0, 0))
  plot(g)
}
#> [1] "TreatmentPatterns"
Created on 2024-08-22 with reprex v2.1.1
@mvankessel-EMC I would like to revisit this topic with you and see how we can gain some insight into package dependencies, including the packages we include that now live in the OHDSI organisation.
Hi @mvankessel-EMC,
I'm interested to see the entire dependency graph for all Darwin packages. Basically I would like to take a look at a graph where each node is an exported function and each edge represents a dependency between functions.
The only functions I am interested in seeing are those exported by Darwin packages - https://darwin-eu-dev.github.io/Packages/docs/packageStatuses.html
I'd like to keep an eye on this graph as a way of understanding the relationships between the packages we are developing and maintaining in Darwin. Is this something that would be in scope for PaRe? If not, can you give me some ideas about how to do this based on your experience working on PaRe?
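To make the request concrete, here is a rough sketch of the graph described above: one node per exported function and one edge per call between exported functions. It assumes a call table with the calling function and the called function is already available; the `calls` table, the `exported` vector and all function names in them are hypothetical and only illustrate the shape of the data.

```r
# Hypothetical call table: one row per call, caller and callee as pkg::fun.
# All names here are made up for illustration.
calls <- data.frame(
  from = c("pkgB::summarise", "pkgB::summarise", "pkgC::plotResult"),
  to = c("pkgA::validate", "pkgA::internalHelper", "pkgB::summarise")
)

# Hypothetical list of exported functions across the packages of interest.
exported <- c("pkgA::validate", "pkgB::summarise", "pkgC::plotResult")

# Keep only edges where both ends are exported functions.
edges <- calls[calls$from %in% exported & calls$to %in% exported, ]

# Passing `vertices` keeps exported functions that have no edges as
# isolated nodes instead of dropping them from the graph.
g <- igraph::graph_from_data_frame(
  edges,
  vertices = data.frame(name = exported)
)
```

For an installed package, base R's `getNamespaceExports()` returns the exported names, which could be one way to build the `exported` vector.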