Analyze github repos to identify R packages needing help #23

Open
kbroman opened this issue Mar 6, 2019 · 7 comments

Comments

@kbroman

kbroman commented Mar 6, 2019

Maybe we could identify R packages on GitHub and measure something like the number of open issues vs. the time since the last commit. That might point to packages of interest that need help.
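As a rough sketch of what such a measure could look like: the snippet below scores each repository by its open issues multiplied by the years since its last commit. The data, the column names (`open_issues`, `last_commit`), the function name `neglect_score`, and the formula itself are all assumptions made up for illustration; nothing in this thread settles on a particular metric.

```r
# Hypothetical example data; the repositories and numbers are made up.
repos <- data.frame(
  repo = c("owner/pkgA", "owner/pkgB", "owner/pkgC"),
  open_issues = c(40, 5, 40),
  last_commit = as.Date(c("2018-03-01", "2019-03-01", "2019-02-20")),
  stringsAsFactors = FALSE
)

# Score = open issues multiplied by years since the last commit.
# This is an assumed formula: many open issues and a long silence both raise it.
neglect_score <- function(open_issues, last_commit, today = Sys.Date()) {
  days_idle <- as.numeric(difftime(today, last_commit, units = "days"))
  open_issues * days_idle / 365.25
}

repos$score <- neglect_score(repos$open_issues, repos$last_commit,
                             today = as.Date("2019-03-06"))

# Highest scores first: candidates that may need help
repos[order(repos$score, decreasing = TRUE), ]
```

With these made-up numbers, `owner/pkgA` ranks first because it combines many open issues with a year of inactivity, while `owner/pkgC` has the same number of issues but a recent commit.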

@jdblischak

@kbroman Good idea! For getting started, @jimhester's package itdepends has some code for gathering relevant info like the number of open issues, etc.

https://github.com/jimhester/itdepends/blob/128cd7e42d866c3beaf8ab40ab3cf2e42392208f/R/github.R#L10

@maurolepore

👍
Are you thinking about popular, published packages (e.g. tidyverse packages)?
If so, I propose something complementary: providing a package-help group for chirunconf packages. I'll explain more in a separate issue, but the main idea is to help on the technical side so that folks unfamiliar with package-development tools can realize their ideas and share them as a package at the end of the unconf.

@kbroman
Author

kbroman commented Mar 8, 2019

@maurolepore I wasn't thinking tidyverse particularly, but rather of trying to crawl GitHub to find repositories that are interesting to people (as indicated by there being issues) but not necessarily kept up to date.

@maurolepore

Maybe related to #32

@chasemc

chasemc commented Mar 8, 2019

Great idea! I don't think you'll need to crawl, the GH API is pretty good, and: https://github.com/r-lib/gh exists

@jdblischak

> Great idea! I don't think you'll need to crawl, the GH API is pretty good, and: https://github.com/r-lib/gh exists

The Search API is more restrictive than the other parts of the API. I often get rejected "for triggering abuse mechanisms" even though I am only querying a few hundred results. Searching every R package on GitHub would take some time.
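One common way to cope with rejected requests is to retry with increasing pauses. The helper below is a minimal sketch of that idea in base R. The name `with_backoff` and its defaults (`max_tries`, `base_wait`) are assumptions for illustration, not part of gh or of GitHub's guidance, and the `flaky` function only simulates a rejection so the example runs offline.

```r
# Call `f()` and retry with exponentially growing pauses if it errors.
# `max_tries` and `base_wait` are assumed defaults, not GitHub recommendations.
with_backoff <- function(f, max_tries = 5, base_wait = 1) {
  for (i in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (i < max_tries) {
      wait <- base_wait * 2^(i - 1)
      message("Attempt ", i, " failed; waiting ", wait, " s before retrying")
      Sys.sleep(wait)
    }
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Demonstration with a stand-in function that fails twice, then succeeds.
attempts <- 0
flaky <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("simulated rejection")
  "ok"
}
out <- with_backoff(flaky, base_wait = 0.01)
```

A real query could presumably be wrapped the same way, e.g. `with_backoff(function() gh::gh("/search/repositories", q = "language:r"))`, though whether this is enough to avoid the abuse detection is untested here.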

Another possibility would be to start with this curated list of GitHub R packages. Starting from this list, we could then use the GitHub API to query specific attributes of each repository.
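A sketch of that second step, assuming we already have a list of `owner/repo` names: the hypothetical helper `repo_summary` pulls a few fields out of one repository record. The field names `full_name`, `open_issues_count`, and `pushed_at` are the ones in the GitHub REST API repository response; `fake_repo` is a made-up record so the example runs without network access. Two caveats: `open_issues_count` also counts open pull requests, and `pushed_at` is the time of the last push, which is only a proxy for the last commit.

```r
# Pull a few fields out of one repository record, shaped like the
# response for a single repository from the GitHub REST API.
repo_summary <- function(repo) {
  data.frame(
    full_name = repo$full_name,
    open_issues = repo$open_issues_count,
    # Keep only the date part of the ISO 8601 timestamp
    last_push = as.Date(substr(repo$pushed_at, 1, 10)),
    stringsAsFactors = FALSE
  )
}

# Made-up record standing in for an API response, so the sketch runs offline.
fake_repo <- list(
  full_name = "owner/pkgA",
  open_issues_count = 12L,
  pushed_at = "2018-11-30T14:05:09Z"
)
repo_summary(fake_repo)

# With network access, something like this should work for a real list
# (untested here; the two names are taken from the search output in this thread):
# repos <- c("slowkow/ggrepel", "trestletech/plumber")
# records <- lapply(repos, function(x) gh::gh(paste0("/repos/", x)))
# do.call(rbind, lapply(records, repo_summary))
```

This uses one request per repository against the regular repository endpoint rather than the Search API.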

@jimhester

jimhester commented Mar 8, 2019

The GH API lets you search repositories by number of help-wanted-issues, which might be a way to go to find some places to help.

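library(purrr)

# Search R repositories, sorted by their number of help-wanted issues
search <- gh::gh("/search/repositories", q = "language:r", sort = "help-wanted-issues", order = "desc")

# The matching repositories are in the `items` element of the response
map_chr(search$items, "full_name")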
#>  [1] "UptakeOpenSource/uptasticsearch"              
#>  [2] "BioinformaticsFMRP/TCGAbiolinks"              
#>  [3] "cbeleites/hyperSpec"                          
#>  [4] "Huh/PopR_SDGFP"                               
#>  [5] "International-Soil-Radiocarbon-Database/ISRaD"
#>  [6] "TommyJones/textmineR"                         
#>  [7] "UptakeOpenSource/pkgnet"                      
#>  [8] "ropenscilabs/learngganimate"                  
#>  [9] "slowkow/ggrepel"                              
#> [10] "BonnyCI/ci-plunder"                           
#> [11] "OpenMx/OpenMx"                                
#> [12] "ProvTools/provR"                              
#> [13] "statnet/ergm"                                 
#> [14] "retrography/OrientR"                          
#> [15] "trestletech/plumber"                          
#> [16] "ices-tools-prod/fisheryO"                     
#> [17] "jackwasey/icd"                                
#> [18] "rich-iannone/DiagrammeR"                      
#> [19] "ropensci/tabulizer"                           
#> [20] "cloudyr/googleComputeEngineR"                 
#> [21] "TIBHannover/BacDiveR"                         
#> [22] "HenrikBengtsson/aroma.seq"                    
#> [23] "HenrikBengtsson/Wishlist-for-R"               
#> [24] "theclue/facebook.S4"                          
#> [25] "NKU-DSC/RTrainingMaterials"                   
#> [26] "kevinwolz/hisafer"                            
#> [27] "fabian-s/tidyfun"                             
#> [28] "vertica/DistributedR"                         
#> [29] "ropenscilabs/dataspice"                       
#> [30] "franzbischoff/tsmp"

Created on 2019-03-08 by the reprex package (v0.2.1)

Of course, this depends on the repo owners using that specific label for issues, which many do not.
