Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fellingdateR: Estimate, report and combine felling dates of historical tree-ring series #618

Closed
10 of 27 tasks
hanecakr opened this issue Nov 20, 2023 · 67 comments
Closed
10 of 27 tasks

Comments

@hanecakr
Copy link

hanecakr commented Nov 20, 2023

Date accepted: 2024-04-08

Submitting Author Name: Kristof Haneca
Submitting Author Github Handle: @hanecakr
Other Package Authors Github handles: (comma separated, delete if none)
Repository: https://github.com/hanecakr/fellingdateR
Version submitted:
Submission type: Standard
Editor: @maelle
Reviewers: @njtierney, @ajpelu

Archive: TBD
Version accepted: TBD
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Package: fellingdateR
Title: Estimate, report and combine felling dates of historical tree-ring 
    series
Version: 0.0.0.9003
Authors@R: c(
    person("Kristof", "Haneca", , "[email protected]",role = c("aut", "cre"),
        comment = c(ORCID = "0000-0002-7719-8305")),
    person("Koen", "Van Daele", role = "ctb",
        comment = c(ORCID = "0000-0002-8153-2978")),
    person("Ronald", "Visser", role = "ctb",
        comment = c(ORCID = "0000-0001-6966-1729"))    
    )
Description: `fellingdateR` is an R package that aims to facilitate the 
  analysis and interpretation of tree-ring data from wooden 
  cultural heritage objects and structures. The package standarizes the process
  of computing and combining felling date estimates, both for individual and 
  groups of related tree-ring series.
URL: https://github.com/hanecakr/fellingdateR
BugReports: https://github.com/hanecakr/fellingdateR/issues
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Depends: 
    R (>= 2.10)
LazyData: true
Imports: 
    dplyr,
    ggplot2,
    ggtext,
    MASS,
    plyr,
    tidyr,
    utils,
    dplR
Suggests: 
    covr,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0)
Config/testthat/edition: 3
VignetteBuilder: knitr

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

The package aims to facilitate and standardize the computation of felling dates from dated tree-ring series. It covers all steps of processing measured ring-width series from raw data to the reporting of felling date estimates, for single series and for groups of related tree-ring series.

  • Who is the target audience and what are scientific applications of this package?

Maily professional tree-ring scientists (dendrochronologists) that work on historical timbers and wooden objects (archaeology, architectural history, sculptures, panel paintings, etc.)

No, no other R-packages do the same job.

yes

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

  • Explain reasons for any pkgcheck items which your package is unable to pass.

All checks pass

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@ropensci-review-bot
Copy link
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Copy link
Collaborator

🚀

The following problem was found in your submission template:

  • URL = [https] is not valid
    The package could not be checked because of problems with the URL.
    Editors: Please ensure these problems are rectified, and then call @ropensci-review-bot check package.

👋

@mpadge
Copy link
Member

mpadge commented Nov 21, 2023

@hanecakr The error was because your URL field was markdown-style: "[url](url)", and should just be plain "url". I've edited to fix; you should now be able to call @ropensci-review-bot check package yourself to restart checks. Sorry for any inconvencience.

@hanecakr
Copy link
Author

@ropensci-review-bot check package

@ropensci-review-bot
Copy link
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Copy link
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Copy link
Collaborator

Checks for fellingdateR (v0.0.0.9003)

git hash: 05c449fb

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 77.6%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.
  • 👀 Function names are duplicated in other packages

(Checks marked with 👀 may be optionally addressed.)

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 363
internal stats 33
internal fellingdateR 26
internal grDevices 7
internal graphics 2
imports ggplot2 32
imports utils 15
imports dplyr 6
imports ggtext 3
imports plyr 2
imports tidyr 2
imports dplR 2
imports MASS 1
suggests covr NA
suggests knitr NA
suggests rmarkdown NA
suggests testthat NA
linking_to NA NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

sub (32), rep (31), length (21), c (17), nrow (17), apply (15), matrix (15), max (14), ncol (13), data.frame (10), seq (10), which (10), range (9), for (8), paste0 (8), as.data.frame (7), attributes (7), numeric (7), as.numeric (6), by (6), drop (5), is.na (5), mean (5), min (5), list (4), summary (4), diff (3), floor (3), format (3), rownames (3), setdiff (3), sum (3), T (3), unique (3), any (2), character (2), get (2), grep (2), lapply (2), names (2), round (2), sapply (2), scale (2), seq_len (2), abs (1), all (1), as.character (1), as.integer (1), ceiling (1), colnames (1), colSums (1), complex (1), cumsum (1), dim (1), file.exists (1), gettext (1), Im (1), intersect (1), lengths (1), logical (1), match (1), merge (1), message (1), Re (1), readLines (1), regexpr (1), seq_along (1), sort (1), sqrt (1), substring (1), table (1), vapply (1)

stats

df (18), spline (7), end (3), sd (2), start (2), sigma (1)

ggplot2

element_blank (13), element_text (5), ggplot (3), aes (2), arrow (2), geom_line (2), unit (2), geom_area (1), scale_x_continuous (1), theme (1)

fellingdateR

hdi (7), sw_model (6), sw_data_overview (3), d.count (2), d.dens (2), fd_report (2), sw_interval (2), strip.comment (1), sw_combine_plot (1)

utils

data (13), citation (1), tail (1)

grDevices

pdf (7)

dplyr

left_join (1), mutate (1), pull (1), relocate (1), select (1), summarize (1)

ggtext

element_markdown (3)

dplR

rwl.stats (2)

graphics

title (2)

plyr

round_any (2)

tidyr

pivot_longer (2)

MASS

fitdistr (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 16 files) and
  • 1 authors
  • 1 vignette
  • 13 internal data files
  • 8 imported packages
  • 15 exported functions (median 80 lines of code)
  • 21 non-exported functions in R (median 50 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 16 74.9
files_vignettes 1 68.4
files_tests 14 93.8
loc_R 2855 89.8
loc_vignettes 193 48.1
loc_tests 480 74.1
num_vignettes 1 64.8
data_size_total 6023 68.2
data_size_median 465 60.0
n_fns_r 36 45.9
n_fns_r_exported 15 58.5
n_fns_r_not_exported 21 42.1
n_fns_per_file_r 1 20.7
num_params_per_fn 4 67.3
loc_per_fn_r 65 93.5
loc_per_fn_r_exp 80 89.2
loc_per_fn_r_not_exp 50 90.3
rel_whitespace_R 11 80.9
rel_whitespace_vignettes 68 73.3
rel_whitespace_tests 11 58.1
doclines_per_fn_exp 36 43.7
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 30 55.5

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

R-CMD-check.yaml
pkgcheck

GitHub Workflow Results

id name conclusion sha run_number date
6932735657 pkgcheck success 05c449 5 2023-11-20
6932735666 R-CMD-check success 05c449 9 2023-11-20
6932735656 test-coverage success 05c449 12 2023-11-20

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking data for non-ASCII characters ... NOTE
    Note: found 11 marked UTF-8 strings

R CMD check generated the following check_fails:

  1. cyclocomp
  2. rcmdcheck_non_ascii_characters_in_data

Test coverage with covr

Package coverage: 77.6

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function cyclocomplexity
read_fh 150
fd_report 42
sw_sum 32
cor_table 31
sw_combine 27
sw_interval 17
movAv 16
sw_combine_plot 15

Static code analyses with lintr

lintr found the following 170 potential issues:

message number of times
Avoid 1:length(...) expressions, use seq_len. 6
Avoid library() and require() calls in packages 1
Avoid using sapply, consider vapply instead, that's type safe 2
Lines should not be more than 80 characters. 154
Use <-, not =, for assignment. 7


4. Other Checks

Details of other checks (click to open)

✖️ The following 3 function names are duplicated in other packages:

    • get_header from eventr
    • hdi from bayestestR, ggdist, hdi, HDInterval
    • movAv from berryFunctions


Package Versions

package version
pkgstats 0.1.3.9
pkgcheck 0.1.2.11


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

@jhollist
Copy link
Member

@hanecakr Thank you for the submission. All looks good to me. Will pass on the handling editor momentarily.

@jhollist
Copy link
Member

@ropensci-review-bot assign @maelle as editor

@ropensci-review-bot
Copy link
Collaborator

Assigned! @maelle is now the editor

@maelle
Copy link
Member

maelle commented Nov 27, 2023

Thanks for your submission @hanecakr! I have some comments before I can proceed with the reviewer search. 🌳

Happy to discuss the comments below! 😸

Editor checks:

  • Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,

    • Is the case for the package well made?
    • Is the reference index page clear (grouped by topic if necessary)?
    • Are vignettes readable, sufficiently detailed and not just perfunctory?
  • Fit: The package meets criteria for fit and overlap.

  • Installation instructions: Are installation instructions clear enough for human users?

  • Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?

  • Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?

  • License: The package has a CRAN or OSI accepted license.

  • Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?


Editor comments

Installation instructions

For brevity you can keep only the installation instructions with pak.

Dependencies

I wonder whether the dependency on plyr for very few functions could be avoided?

Metadata

I don't think it's allowed to have backticks in a DESCRIPTION file, I see some in the Description field. Or if it's allowed, at least it's unusual?

Docs

I'd recommend creating a pkgdown website.
In the vignette and README it'd make sense to remove the description of functions and datasets, instead pointing to the reference index of the pkgdown website. It'd make it easier to skim the README, and to see what functions/datasets there are at once (for reviewers but also users!).

Data

I see the example datasets are saved as internal datasets however you use them in the documentation (in the README) so why not make them exported datasets? https://r-pkgs.org/data.html#sec-data-data

Rather than using the names with "dummy" I'd recommend informative names. (Beside, some might consider "dummy" to be an offensive word related to ableism).

#### Tests

I see some top-level code in for instance https://github.com/hanecakr/fellingdateR/blob/master/tests/testthat/test-fd_report.R I'd recommend instead having a function defined in a helper file like tests/testthat/helper-testdata

test_data <- function() {
data.frame(series = c("aaa", "bbb", "ccc", "no_last", "no_sapwood"),
                       n_sapwood = c(10, 11, 12, 10, NA),
                       waneyedge = c(FALSE, FALSE, TRUE, FALSE, FALSE),
                       last = c(123, 456, 1789, NA, 1978))
}

that you'd then call from each test (not each test file) to create the data. Even better, it could have a more informative name. For more context: https://blog.r-hub.io/2020/11/18/testthat-utility-belt/#code-called-in-your-tests and https://r-pkgs.org/testing-advanced.html#sec-testing-advanced-fixture-helper
With such a strategy, each test is easier to read and to run, as it's self-contained.

I think some test lines could be more condensed, e.g. https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/tests/testthat/test-fd_report.R#L105 could go on the previous line, and https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/tests/testthat/test-fd_report.R#L96-L98 or https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/tests/testthat/test-fd_report.R#L120-L122 might be a single line, to see more of the test file at once on the screen?

You shouldn't namespace fellingdateR::: data in test files.

For test-sw_combine_plot.R you might want to look into snapshot testing rather than using ggplot2's internal data structure: https://testthat.r-lib.org/articles/snapshotting.html#whole-file-snapshotting (other relevant source: https://www.tidyverse.org/blog/2022/09/playing-on-the-same-team-as-your-dependecy/#testing-testing)

Why is there a test that is commented out? https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/tests/testthat/test-sw_interval.R#L2 (as opposed to, say, completely removed)

Instead of the regular expressions for testing errors, you might be interested in this approach of classed errors: https://www.mm218.dev/posts/2023-11-07-classed-errors/

Code

Please read the output from lintr in the automatic checks above, and fix accordingly (or update this thread to explain why you disagree).

Lines such as https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/R/cor_table.R#L117 should use inherits()

if (!inherits(x, "rwl")) {

In for instance cor_table.R, I think the code might be more readable with explaining variables. For instance

if (!all(diff(as.numeric(row.names(x))) == 1)) {

would be

increasing_consecutive_years <- (all(diff(as.numeric(row.names(x))) == 1))
if (!increasing_consecutive_years) {

Same comments as for test files, I think there could be fewer new lines in scripts such as https://github.com/hanecakr/fellingdateR/blob/master/R/cor_table.R, to facilitate reading / vertical scrolling. I guess this conflicts a bit with the huge indentations in the files (which you are obviously allowed to prefer!).

For the else if in https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/R/sw_interval.R#L236, why not use the switch() function? I must confess I like using it but have to look up the docs every time to remember how to input the arguments. But it'll decrease the complexity of the current logic.

In https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/R/sw_interval_plot.R#L137 what warnings are suppressed? In the same post mentioned previously https://www.mm218.dev/posts/2023-11-07-classed-errors/, it's shown that one can selectively suppress warnings of a given class.


@hanecakr
Copy link
Author

Thanks a lot @maelle for this first round of comments. Will try to tackle most of these issues in the next few days.

@hanecakr
Copy link
Author

Dear @maelle. I've been working on the package and think I'm able now to address most of your comments.
Thanks a lot for your time, valuable insights and recommendations!

I'll list the answers to your comments here:

Installation instructions

For brevity you can keep only the installation instructions with pak.

  • [x ] Ok. I've removed the install with devtools from README

Dependencies

I wonder whether the dependency on plyr for very few functions could be avoided?

  • [x ] The function plyr::round_any is now removed from sw_interval_plot(), sw_combine_plot() and sw_sum_plot()
  • [x ] floor/ceiling(x/10)*10 now does the job

Metadata

I don't think it's allowed to have backticks in a DESCRIPTION file, I see some in the Description field. Or if it's allowed, at least it's unusual?

  • [x ] Backticks removed from DESCRIPTION

Docs

I'd recommend creating a pkgdown website.

  • [x ] Initiated a pkgdown website now. I had postponed this because I also have a short paper ready (after code-review) for JOSS. Planned to use that paper (and reviewer comments) for an article/vignette on a pkgdown site. But obviously, for reviewers a pkgdown-site is most helpful at this stage already.

Data

I see the example datasets are saved as internal datasets however you use them in the documentation (in the README) so why not make them exported datasets? https://r-pkgs.org/data.html#sec-data-data

  • [x ] The exported data are factual (published) data that are used by nearly all functions in the package. With sw_data_overview() users get an overview of what sapwood datasets are available in the package and with sw_data_info() basic descriptives of those dataset.
  • When I would make the made-up date also external, that would mix real-world data and made-up data. And would also complicate the code for sw_data_overview(). Therefore I would like to keep the made-up data for examples and tests as internal data.

Rather than using the names with "dummy" I'd recommend informative names. (Beside, some might consider "dummy" to be an offensive word related to ableism).

  • [x ] Thanks for pointing this out. Changed this throughout all functions and tests to “trs_example1” (tree-ring series example). Indeed much better than ‘dummy’ or ‘fake’.

Tests

I see some top-level code in for instance https://github.com/hanecakr/fellingdateR/blob/master/tests/testthat/test-fd_report.R I'd recommend instead having a function defined in a helper file like tests/testthat/helper-testdata
...

  • [x ] The top-level code is in fact redundant. The test can equally be run with the internal data. This has now been implemented in the tests.

I think some test lines could be more condensed, e.g. ...

  • [x ] I reformatted most test-files to make them more condensed. Should have done that before submission…

You shouldn't namespace fellingdateR::: data in test files.

  • [x ] Right. Removed the namespace to fellingdateR in tests.

For test-sw_combine_plot.R you might want to look into snapshot testing rather than using ggplot2's internal data structure: https://testthat.r-lib.org/articles/snapshotting.html#whole-file-snapshotting (other relevant source: https://www.tidyverse.org/blog/2022/09/playing-on-the-same-team-as-your-dependecy/#testing-testing)

  • I've left this as it is now. Can't find an elegant way to implement snapshot testing right now. Need to look deeper into that.

Why is there a test that is commented out? https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/tests/testthat/test-sw_interval.R#L2 (as opposed to, say, completely removed)

  • [x ] Oops, Removed this from the test-file

Instead of the regular expressions for testing errors, you might be interested in this approach of classed errors: https://www.mm218.dev/posts/2023-11-07-classed-errors/

Code

Please read the output from lintr in the automatic checks above, and fix accordingly (or update this thread to explain why you disagree).

Lines such as https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/R/cor_table.R#L117 should use inherits()
if (!inherits(x, "rwl")) {

  • [x ] Now using inherits() in the latest version of cor_table().

In for instance cor_table.R, I think the code might be more readable with explaining variables. For instance
if (!all(diff(as.numeric(row.names(x))) == 1)) {
would be
increasing_consecutive_years <- (all(diff(as.numeric(row.names(x))) == 1))
if (!increasing_consecutive_years) {

  • [x ] OK, I've changed this to increase readability

Same comments as for test files, I think there could be fewer new lines in scripts such as https://github.com/hanecakr/fellingdateR/blob/master/R/cor_table.R, to facilitate reading / vertical scrolling. I guess this conflicts a bit with the huge indentations in the files (which you are obviously allowed to prefer!).

  • [x ] I’ve reformatted the code with RStudio > reformat code. Is indeed (a bit) more condensed now.

For the else if in https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/R/sw_interval.R#L236, why not use the switch() function? I must confess I like using it but have to look up the docs every time to remember how to input the arguments. But it'll decrease the complexity of the current logic.

  • I remember I tried this before, but failed to implement switch() for this function. Would prefer to leave it as it is now.

In https://github.com/hanecakr/fellingdateR/blob/05c449fb13b678ac1700177d73a20fad6c55da8c/R/sw_interval_plot.R#L137 what warnings are suppressed? In the same post mentioned previously https://www.mm218.dev/posts/2023-11-07-classed-errors/, it's shown that one can selectively suppress warnings of a given class.

  • [x ] Doesn’t seem to be necessary anymore to use suppressWarnings() to avoid messages from ggplot, for sw_model() and sw_combine(). So I’ve removed the calls to suppressWarnings() for these functions.
  • For sw_interval() however I’ve kept the call to suppressWarnings to avoid warnings as:

 Warning: Removed 52 rows containing non-finite values (stat_align()).
 Warning: Removed 43 rows containing missing values (geom_line()).

I'm sure there are more elegant and effective ways to improve the scripts, functions and their documentation, but I hope that these initial revisions have elevated the quality of the package sufficiently, making it amenable for the review process. Of course, I'm open for additional insights and recommendations to further improved the quality and performance of the package.

-- Kristof

@maelle
Copy link
Member

maelle commented Nov 30, 2023

Thanks a ton @hanecakr!

A small note, in GitHub Markdown if you type - [x] the checkbox appears as a ticked checkbox, whereas - [x ] does not.

In the README when mentioning the reference index, you could rephrase the link so as not to use "here" (https://webaccess.berkeley.edu/ask-pecan/click-here), for instance "You can find the [list of functions]".

When I would make the made-up date also external, that would mix real-world data and made-up data. And would also complicate the code for sw_data_overview(). Therefore I would like to keep the made-up data for examples and tests as internal data.

Ok, but in that case would it work to not load sysdata (which might look confusing) but instead use data(dataname, package = "fellingdateR")? Also to avoid using ::: when using the data in examples.

Should have done that before submission…

There's already a lot of stuff to do before submission!

Don't hesitate to ask if you need help with snapshot testing, but it's maybe not useful here anyway.

A small non compulsory thing, I see your default branch is named master, you could change it to the new community standard "main" using the usethis function for instance. See also https://www.tidyverse.org/blog/2021/10/renaming-default-branch/

Another Git thing, your repository contains the .Rproj.user folder, should it be removed and added to .gitignore? See https://usethis.r-lib.org/reference/git_vaccinate.html for a handy way to gitignore it globally.

I remember I tried this before, but failed to implement switch() for this function. Would prefer to leave it as it is now.

Fair enough! Happy to try and make a PR if you change your mind (maybe this time I'd remember how to use it straight away 😂 )

I'll now look for reviewers! Thanks again for all your work!

@maelle
Copy link
Member

maelle commented Nov 30, 2023

@ropensci-review-bot seeking reviewers

@ropensci-review-bot
Copy link
Collaborator

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/618_status.svg)](https://github.com/ropensci/software-review/issues/618)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

@hanecakr
Copy link
Author

@maelle Thanks a lot!
Meanwhile, I will change the branch name to main, use the data(...) function to load internal data, and have a look at the .Rproj.user folder and try to find a way to add it to .gitignore. The badge will be added to README soon.

@maelle
Copy link
Member

maelle commented Dec 1, 2023

@ropensci-review-bot add @njtierney to reviewers

@ropensci-review-bot
Copy link
Collaborator

@njtierney added to the reviewers list. Review due date is 2023-12-22. Thanks @njtierney for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

@ropensci-review-bot
Copy link
Collaborator

@njtierney: If you haven't done so, please fill this form for us to update our reviewers records.

@maelle
Copy link
Member

maelle commented Dec 1, 2023

Thank you @njtierney for accepting to review this package!

@hanecakr
Copy link
Author

hanecakr commented Dec 1, 2023

@njtierney: with a little delay, I decided to follow the advice of @maelle and moved the example datasets from internal to external data, so they become easily available for testing and examples, both for end-uses as during code review. The latest commit implements the necessary (small) changes. Looking forward to your comments and suggestions. -- Kristof.

@njtierney
Copy link
Contributor

Thanks, team! I've got a few deadlines this week but will take a look next week and aim to have this done by the 22nd.

@maelle
Copy link
Member

maelle commented Dec 5, 2023

Thank you @njtierney and good luck with the deadlines!

@maelle
Copy link
Member

maelle commented Dec 9, 2023

@ropensci-review-bot add @ajpelu to reviewers

@ropensci-review-bot
Copy link
Collaborator

@ajpelu added to the reviewers list. Review due date is 2023-12-30. Thanks @ajpelu for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

@maelle
Copy link
Member

maelle commented Feb 9, 2024

👋 @hanecakr, any update? 😸

@hanecakr
Copy link
Author

hanecakr commented Feb 9, 2024 via email

@maelle
Copy link
Member

maelle commented Feb 9, 2024

Thanks for the update @hanecakr, and good luck with the deadlines!

@hanecakr
Copy link
Author

hanecakr commented Mar 20, 2024

Again a big thank you @njtierney for your review report and time.
I was not able to address all issues raised earlier, but have now found some time to work on the package and to provide an answer to your comments and suggestions.
I've copy-pasted the review report and inserted my replies below.

Package Review

Please check off boxes as applicable, and elaborate in comments below.
Your review is not limited to these topics, as described in the reviewer
guide

  • Briefly describe any working relationship you have (had) with the
    package authors.
  • As the reviewer I confirm that there are no conflicts of
    interest
    for me to
    review this work (if you are unsure whether you are in conflict,
    please speak to your editor before starting your review).

I have added you as a reviewer in de DESCRIPTION

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software
    is designed to solve and its target audience in README

The opening paragraphs of the README are good, and I think that this R
package solves a challenging problem, so firstly, well done! I think
could be made a little bit clearer in terms of the problem it solves,
and the input it takes. While I find the photos useful, it initially
made me think that this software takes images as input. I would suggest
something more like what is in the vignette to start:

fellingdateR offers a set of functions that assist in inferring
felling date estimates from dated tree-ring series.

Then, describe the problem you want to solve, which I think is
estimating when the timber was cut down. Then show the data, explain
what the columns mean, and how this might be a typical example of dated
tree-ring series data.

Then show a short example of the output, clearly demonstrating the
problem the package solves.

The rest of the first paragraph:

The presence of (partially) preserved sapwood or waney edge allows to
estimate a range for the actual felling date, for individual series as
well as for a group of timbers. Furthermore, an additional function
provides a tool to sum sapwood probability distributions, comparable
to 'summed probability densities' commonly applied to sets of
radiocarbon (14C) dates.

Is important, but I think could go into more of a methods/general
introduction part of the README, perhaps further down.

I'm not sure what the images show me, and so to communicate this
effectively I think they should contain a caption.

I think the target audience could be more clearly stated in the README.
Perhaps at the end of the first paragraph.

README has been rewritten according to comments of both reviews.

The 'Get started' vignette provides more detail and examples.

  • Installation instructions: for the development version of
    package and any non-standard dependencies in README

All installed well for me!

  • Vignette(s): demonstrating major functionality that runs
    successfully locally

It did run successfully locally! T and F should be specified as
TRUE and FALSE.

Now TRUE and FALSE are used consistently

  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported
    functions

The examples ran without error, using:

devtools::run_examples()
  • Community guidelines: including contribution guidelines in
    the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports
    and Maintainer (which may be autogenerated via Authors@R).

There are no community guidelines in the README, I see them in the file:
.github/CONTRIBUTING.md, but these are not linked to in the README.
Once these are linked, e.g., by writing something like:

## Code of Conduct

Please note that the visdat project is released with a [Contributor Code of Conduct](https://github.com/hanecakr/fellingdateR/blob/main/.github/CONTRIBUTING.md). By contributing to this project, you agree to abide by its terms.

Community guidelines and code of conduct have been added

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been
    confirmed.
  • Performance: Any performance claims of the software been
    confirmed.
  • Automated tests: Unit tests cover essential functions of the
    package and a reasonable range of inputs and conditions. All tests
    pass on the local machine.

All tests pass - unit tests seem quite good coverage, evaluated using
devtools::test_coverage().

  • Packaging guidelines: The package conforms to the rOpenSci
    packaging guidelines.

  • package name passes checks on available::available("fellingdateR")

  • I think if possible the author should consider renaming the package
    to all lowercase, fellingdater or fellingdatr.

Not sure what is the best way to do this. Any practical
guidelines? UPDATE: I was a bit afraid to initiate a Git inferno, but apparently it's not that difficult.
So the package name had been changed to fellingdater.

  • There are other considerations that I think mean it does not
    currently conform to the rOpenSci packaging guidelines. Rather than
    discuss them in too much depth here, I will put them in the review
    section below.

Estimated hours spent reviewing: 5

  • Should the author(s) deem it appropriate, I agree to be
    acknowledged as a package reviewer ("rev" role) in the package
    DESCRIPTION file.

You have been added as reviewer in the DESCRIPTION


Review Comments

I wanted to open by saying that while I have a lot of feedback, I think
that this is a great piece of software that helps solve a tough problem,
so well done on the author for writing this! I hope that the feedback is
useful 😄 . Please let me know if something is not clear or if you need
help implementing these, or further information. Thank you for
submitting this software, I enjoyed reviewing it.

General comments

There are a fair few examples from the rOpenSci packaging
guide
, which I don't think
are followed, I have gone through the guide and written some examples
here. After the author makes these changes, I would recommend they
double check the guide.

  • Recommend making sure all functions and objects use snake_case.

All code is in snake_case now

Except for the `read_fh()` function. In the fellingdateR
package I build upon the originale code of the read.fh function from
the dplR package. I would prefer to stay a close as possible to the
original code in the dplR package in order to facilitate future
cooperation and possible integration of both functions.

  • argument name uses x for most data frame inputs. I would recommend
    considering naming data things data or .data or similar to help
    distinguish them from a vector, x. Not required but worth
    considering, I think.

  • There is some use of cat in the package, recommend using cli as
    described in the tidyverse style guide on writing error
    messages
    . I expand
    on this below.

cat() no longer used (except for read_fh - see comment above)

  • Code style is not consistent, there is mixed use of the number of
    indentations : between 0 and 8 spaces. I would recommend applying
    the tidyverse style guide to the package with styler::style_pkg()

Code has been restyled using the styler-package

  • Indenting code is important but this 8 space indentation is not
    consistent with other indentation used in your package, and when
    reading the code gives the impression that the code is happening
    inside some/several if/else/for control statements. I would
    recommend applying a style guide such as the tidyverse style guide
    or similar, to the code, so that indentation is consistent.

  • Due to indentation, a lot of lines of code go over 80 characters. I
    think it is worth the time to re-indent, or rewrite some code by
    using explaining variables, so the code doesn't go over 80
    characters

Code has been restyled using the styler-package

  • = is sometimes used over <- - I recommend using <-
    consistently.

<- is now used consistently as assignment operator

  • There is no top level documentation for ?fellingdateR - this could
    be achieved using usethis::use_package_doc().

This has been added

see https://hanecakr.github.io/fellingdater

  • internal functions, like d.dens and d.count should have a
    #' @noRd tag to mark is as an internal function

These function now resided in helper-functions.R with #'
@nord

  • examples in code should use all argument parameters

Most examples now include all arguments

  • Recommend the author reads through the CRAN
    gotchas

  • What does the sw stand for in things like sw_combine and co?

sw = [s]{.underline}ap[w]{.underline}ood, fd =
[f]{.underline}elling [d]{.underline}ate. I've made this more
obvious in the README introduction.

  • Some of the documentation uses reversed backticks, which I haven't
    seen before, e.g.: ´n_sapwood´ and ´count´

corrected

  • There are still a few lines of code that don't pass the
    goodpractice::gp() checks. In particular, I think these comments
    are important:
  ✖ write short and simple
    functions. These functions
    have high cyclomatic
    complexity (>50): read_fh
    (150). You can make them
    easier to reason about by
    encapsulating distinct steps
    of your function into
    subfunctions.
  ✖ use '<-' for
    assignment instead of '='.
    '<-' is the standard, and R
    users and developers are used
    it and it is easier to read
    your code for them if you use
    '<-'.
  ✖ avoid long code lines,
    it is bad for readability.
    Also, many people prefer
    editor windows that are about
    80 characters wide. Try make
    your lines shorter than 80
    characters
  ✖ avoid sapply(), it is
    not type safe. It might return
    a vector, or a list, depending
    on the input data. Consider
    using vapply() instead.
  ✖ avoid 1:length(...),
    1:nrow(...), 1:ncol(...),
    1:NROW(...) and 1:NCOL(...)
    expressions. They are error
    prone and result 1:0 if the
    expression on the right hand
    side is zero. Use seq_len() or
    seq_along() instead.
  ✖ avoid 'T' and 'F', as
    they are just variables which
    are set to the logicals 'TRUE'
    and 'FALSE' by default, but
    are not reserved words and
    hence can be overwritten by
    the user.  Hence, one should
    always use 'TRUE' and 'FALSE'
    for the logicals.

all codes has been styled with the styler-package, <- , TRUE
and FALSE now used consistently, and length of some functions
reduced by implementing some helper-functions, e.g. for checking
input.
Use of sapply and 1:length() has been avoided.

  • You interchange between using = and <- in your code. I would
    recommend using <- only. See for example in cor_table.R:

= no longer used as assignment operator

if (is.null(y)) {
     y = x
     noRef = TRUE
}
else {
     noRef = FALSE
     y_ori <- y
}
  • Error messages. I would recommend building input checking functions
    to assist in how your write up error messages. There are a few key
    benefits to this:

    • The input checking function then does not get in the way of
      understanding the intent of your function code.
    • You can reuse the input checking functions, so you don't need to
      write them again.
    • using cli to build the error messages allows you to use glue
      strings, so you don't have to try and quote or inject other
      information into the message string, it should be easier to add
      details you care about.

check_input() is now one of the helper functions in
helper-functions.R.

  • Error messages are hard to write well, and it's great that you've
    included some good input checking! I think you could make the error
    functions a bit better for the user by following the tidyverse
    style guide on error
    messages
    .

  • explaining variables. I've mention this a few times in the other
    functions in your package, I think it would be worthwhile searching
    through your cases of using if and if there is a long conditional
    in there, e.g.,

any(pdf_matrix[, 2:length(keycodes) + 1] == 1, na.rm = TRUE))

  • Then I think it would be worthwhile either writing a small wrapper
    function to identify this, or wrap that up in an explaining
    variable.

  • plot = TRUE as a function option.

    • I believe plotting functions should be separate to statistical
      transformation/operations. You have written these in ggplot,
      and so you can specify an autoplot method or a separate
      plot_<function> command.
    • The user should be able to reconstruct the plot from the data
      that you give them in these functions. E.g., they should be able
      to get the key information out, such as for sw_interval, the
      following information should be given from the function: n,
      hdi, and the number of sapwood rings.
    • It is not clear to me how to get this information, and I think
      that this is really important that the user doesn't end up
      locked into a plot to get their vital statistics. If they want
      to be able to generate tables or other statistics, then they
      cannot do this programmatically, they would have to physically
      eyeball a plot and record down the numbers, like `hdi (95.4%) =
      between 8 and 26 sapwood rings". Which is prone to errors.

All numerical information needed to build the plots can be found
in the output of the sw_model(), sw_interval(), sw_combine() and
sw_sum() functions. Their plot argument defaults to plot = FALSE. So
the output of e.g sw_combine(trs_example1) can be used as the input
for sw_combine_plot()

```{r}
tmp <- sw_combine(trs_example1) 
tmp
sw_combine_plot(tmp)
```
  • consistent file names. Some of the files have camelCase names
    (movAv.R), others are snake_case. I would recommend sticking to a
    consistent naming scheme, snake_case.

all snake_case now

  • I would try and avoid having else statements contain
    errors/stops/warnings/messages. This is because in order to
    understand the message at the end, you need to then walk back up
    through the condition of logic beforehand. The way to avoid this is
    to clearly state the error condition at the top.

Defensive programming has been implemented now, avoiding the use
of `else` statements followed by a stop/error-message.

Input checking

I would recommend writing small helpers for input checking, and
considering using cli to help write error messages, as it means you
could transform this:

if (!inherits(x, "rwl"))  {
     warning("'x' is not class rwl")
}
if (!inherits(y, "rwl"))  {
     warning("'y' is not class rwl")
}

Into:

warn_if_not_rwl(x)
warn_if_not_rwl(y)

And that code could look like this:

warn_if_not_rwl <- function(x,
                            arg = rlang::caller_arg(x),
                            call = rlang::caller_env()){
     cli::cli_warn(
          c("{arg}' is not of class {.cls rwl}")
     )
}

Similarly,

increasing_consecutive_years <- all(diff(as.numeric(row.names(x))) == 1)
if (!increasing_consecutive_years) {
     stop(
          "The tree-ring series 'x' have/has no consecutive years in increasing order as rownames."
     )
}

Could be written as a function:

check_if_increasing_consecutive_years(x)
check_if_increasing_consecutive_years(y)

Admittedly, I do have a strong preference for writing these types of
functions, having written about it
recently
,
but I do think that at least using explaining variables, which you've
already done in places like:

increasing_consecutive_years <- all(diff(as.numeric(row.names(x))) == 1)

Are a great idea, and there are a few notable places where that would
help make the code a bit easier to read, e.g.,

any(
length(min_overlap) != 1 |
!is.numeric(min_overlap) |
min_overlap %% 1 != 0 |
min_overlap < 3
)
  • check_input() is now one of the helper functions in
    helper-functions.R

  • smaller checks for input values are now available as a
    helper-function.

  • Most examples you give above is from the read_fh() function. see
    my previous motivation why I would like to stay close to the
    original dplR::read.fh() code.

cor_table.R

Refactoring values argument of cor_table. There is a lot of input
checking for the values argument. I think that things such as :

if ("glk" %in%  values) {

And so on indicate to me that these could be written up as separate
functions, which could return a list of their inputs, perhaps. These
could then be delivered using switch, which I often forget how to use,
but it would be something like:

values_output <- switch(values,
       "glk" = values_glk(inputs),
       "pearson" = values_pearson(inputs))

Examples should demonstrate all types of the inputs for the function
arguments.

parameter `values` was removed from the function. Looking back,
this is not an option that would be used frequently., and is certainly
not required. Removing it from the function allows to shorten the code a
bit, and avoids a lot of the necessary checks.

data.R

I would recommend standardising the dataset names to be all lowercase,
so that they are easier to remember. E.g., Sohar_2012_FWE_c becomes:
sohar_2012_fwe_c

The datasets include names of authors. The names of the datasets can
be easily copied from sw_data_overview()

fd_report.R

I think that fd_report could be renamed felling_report or
felling_date_report or similar. While fd is concise, I think it
doesn't help facilitate discoverability of the functions.

Similar to cor_table.R, I think that:

if (!series %in% names(df)) {
      stop("--> 'series' does not exist")
}
if (!last %in% names(df)) {
      stop("--> 'last' does not exist")
}
if (!n_sapwood %in% names(df)) {
      stop("--> 'n_sapwood' does not exist")
}
if (!waneyedge %in% names(df)) {
      stop("--> 'waneyedge' does not exist")
}

Could be rewritten as check_if_variable_exists(). Something like:

check_if_variable_exists <- function(x,
                                     df,
                                     arg = rlang::caller_arg(x),
                                     call = rlang::caller_env()){
     arg_in_data <- x %in% names(df)
     if (!arg_in_data) {
          cli::cli_abort(
               c("{.arg {arg}} does not exist")
          )
     }
}

example_checker <- function(x, 
                            series = "series", 
                            last = "last"){
     check_if_variable_exists(series, x)
     check_if_variable_exists(last, x)
}

example_checker(mtcars, 
                series = "wrong")

## Error in `check_if_variable_exists()`:
## ! `series` does not exist

The check_input function is now part of helper-functions.R

get_header.R

This function should move the cat message up the top - and should not
use cat, instead using one of the cli functions, like cli_abort.

I think you could use structure instead of setting attributes to NULL:

attr(rwl, "row.names") <- NULL
attr(rwl, "po") <- NULL
attr(rwl, "class") <- NULL
attr(rwl, "names") <- NULL

## becomes

rwl <- structure(
          rwl,
          row.names = NULL,
          po = NULL,
          class = NULL,
          names = NULL
     )

Although I think that they are functionally the same, so feel free to
ignore!

cat() no longer used

hdi

This function uses = and <- - suggest sticking to just <-

=no longer used, in favour of <-

movAv

I think this starting chunk would be clearer if only if and not else
is used.

The stop error can move to the top of this, so we clearly capture if
align is not "center" or "right" or "left". This makes it easier to
understand the conditions of error.

if (align == "center") {
     before <- floor((w - 1) / 2)
     after  <- ceiling((w - 1) / 2)
} else if (align == "right") {
     before <- w - 1
     after  <- 0
} else if (align == "left") {
     before <- 0
     after  <- w - 1
} else {
     stop("'align' should be 'center', 'left' or 'right'")
}

I suggest using another explaining variable inside mean:

mean(x[max(0, (i - before)):(i + after)], na.rm = TRUE)

## to something like:

earliest_to_latest <- x[max(0, (i - before)):(i + after)]
mean(earliest_to_latest, na.rm = TRUE)

## or given that this is repeated later
## potentially write this up as a function for reuse?
mean_earliest_latest(x, i, before, after)

As that mean statement is a bit involved to unfurl.

Similarly, the pattern, if (edges == "fill") { and
} else if (edges == "nofill") { should be bundled up into a function
and applied with switch

Checks for edges and fill are now on top of the script. Else
statements have been avoided.

read_fh.R

  • Nice work in the attribution to the other previous work this
    extends. It looks like this is borrowed from dplR directly, and as
    such there are small style changes. I think it is worthwhile
    updating the code style to fit within your package.
  • Be consistent with naming variables, header.taken should be
    header_taken etc.
  • There are a few random comments that I'm not sure need to be there:
        # NEW: verbose = TRUE, header = FALSE
        inp <- readLines(fname, ok = TRUE, warn = FALSE)
        # NEW: removes empty lines in .fh file
        inp <- inp[nchar(inp) != 0]
        ## Get start and end positions of headers and data blocks
        header.begin <- grep("^HEADER:$", inp)
        # NEW: Quadro => chrono
        # NEW: Double => half chrono
lengths <- numeric(n) # commit Ronald Visser

I have found that moving comments either into documentation or into
issues to help track them is helpful, but I appreciate that sometimes it
is best to leave them in the code, but just something that might be
worth thinking about :)

Tidying up the error messages in this function would make some of these
nested if/else clauses easier to understand.

This is a pretty massive function, a bit over 1200 lines of code. I
would recommend breaking down the steps inside this into smaller
functions, as this will make the code easier to reason with and maintain
in the future.

In the fellingdateR package I build upon the original code of the
read.fh function from the dplR package. I would prefer to stay a close
as possible to the original code in the dplR package in order to
facilitate future cooperation and possible integration of both
functions.

I removed all unnecessary comments as they were highlighting sections
where I've made changes to the original code.

dplR::read.fh() concentrates on extracting the measurement data. The
fellingdateR::read_fh() function extracts also the descriptive
(meta-)data from the HEADER fields in a .fh file. This is not possible
with the dplR::read.fh function.

Furthermore the fellingdateR::read_fh function allows to read data in
CHRON or HALF-CHRONO format.

read.fh() also throws errors when header fields include Capital
letters (depends on the software used to produce the .fh files: TSAP,
PAST, ...). read_fh() is case-insensitive

sw_combin_plot.R

This is the first time I've seen ############ comment blocks - I'm all
for stylistic choices but I am not sure this is needed, especially if
this isn't used in other functions.

comment blocks with #### removed

I've not seen this pattern to avoid R CMD Check notes before

   # to avoid notes in CMD check
   year <-
      p <-
      lower <-
      upper <- COMB <- last <- n_sapwood <- A_i <- agreement <- NULL

My tactic has always been to have a separate definition of these, as
answered by Carson Sievert on the posit community
paage
.
I don't think there's anything inherently wrong with that, but I could
imagine that in some cases this could accidentally erase inputs.
Something to be aware of, perhaps?

When I run devtools::check() I get

❯ checking R code for possible problems ... NOTE  
x: no visible binding for global variable ‘p’ 

assigning NULL to these variables avoids the notes., as described in R
Packages (2e)
https://r-pkgs.org/package-within.html#delta-a-failed-attempt-at-making-a-package

I am all for using the new base R pipe |> - however you need to update
your Depends in your DESCRIPTION like so in order to use it, since it
only came out in R 4.1.0:

Depends: 
    R (>= 4.1.0)

Thanks! Corrected.

This comment should probably live in a github issue or just be removed:

      # NEXT LINE TRIGGERS WARNING
      # Warning message:
      # Using one column matrices in `filter()` was deprecated in dplyr 1.1.0.
      # ℹ Please use one dimensional logical vectors instead.
      # ℹ The deprecated feature was likely used in the fellingdateR package.
      # Please report the issue to the authors.
      # { if (nrow(summary |> dplyr::filter(agreement == "poor")) != 0)
      # replaced by:

these comments are removed

sw_combine.R

This error should check each of the conditions separately - either it
has missing values, or it is not numeric.

if (any(is.na(endDate)) | !is.numeric(endDate)) {
     stop(
          "--> Please check the column with 'end dates'.
Some values are possibly missing or the values are not numeric"
     )
}

A check_input() function (in helper-functions.R) now takes care of
the input

sw_data_info.R

I think these error messages would benefit from using cli, as
discussed above.

sw_data_overview.R

This is a nice function to include to facilitate data discovery

sw_interval_plot.R

This code

if (all(
     !(attributes(x)$names) %in% c(
          "year",
          "n_sapwood",
          "p")
))
     stop("Input differs from output sw_interval()")

Could be rewritten as an error function or the condition in if could
be expressed as a function.

sw_interval.R

In the final line of documentation for this function there is a hanging
sentence:

#' @return Depends on the value of `hdi`.
#'
#'  * If `hdi = TRUE`, a `numeric vector` reporting the upper and lower limit
#'   of the hdi (attributes provide more detail on `credMass` and the applied
#'   sapwood model (`sw_data`)).
#'  * If `hdi = FALSE`, a `matrix` with scaled p values for each number of
#'   observed sapwood rings. This matrix

Well spotted! Corrected.

sw_model.R

Great to see input checking at the top of the function - I do think
these should be rewritten as check input functions.

Helper function d.count I think should be put into a separate R file
called utils.R or helpers.R

d.count should use switch pattern and pass functions rather than
using if controls.

d.count should be d_count

check_input() and d_dens() (instead of d.count) are now part of
helper-functions.R

sw_sum_plot.R

indentation in this code is not consistent - recommend applying a style
guide.

Examples should show different variations possible for function
arguments. E.g., bar_col, spline_col, dot_col, and dot_size
should all be specified in the examples so the user can see what the
input should/could be.

examples have been updated with more visibility for the different
parameters.

sw_sum.R

See note above on including plots.

tests

  • Do not need to namespace testthat calls, e.g., remove testthat::
  • consider using snapshot testing, to capture exact values and shape
    of data that should be stable - rather than always testing for data
    shape and type columns, this should be able to capture those outputs
  • consider snapshot error testing to capture exact error messages
  • consider using vdiffr for testing ggplot plots. See
    visdat
    for examples

_Originally posted by @njtierney in
#618 (comment)

@hanecakr
Copy link
Author

hanecakr commented Mar 20, 2024

Dear @ajpelu. You comments and suggestions in your review report have been most helpfull to improve the package. Thank you for your time investment!
I've copy-pasted the review report and inserted my replies below.

Package Review

Please check off boxes as applicable, and elaborate in comments
below. Your review is not limited to these topics, as described in the
reviewer guide

  • Briefly describe any working relationship you have (had) with
    the package authors.
  • As the reviewer I confirm that there are no conflicts of
    interest
    for me
    to review this work (if you are unsure whether you are in
    conflict, please speak to your editor before starting your
    review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software
    is designed to solve and its target audience in README

I think the target audience of the pkg would be indicated more
explicitly.

The images are not very illustrative of what the package does, they
simply seem to indicate that they are wooden tokens. Perhaps it would
be interesting to think of a schematic or similar, but the photos do
not seem to provide much.

  • Installation instructions: for the development version of
    package and any non-standard dependencies in README

In the installation instructions, it might be more effective to
provide a link to the GitHub repository of the package rather than
displaying the raw GitHub link itself.

Why does the author prefer to use the package pak rather than devtools
to install the development version of the pkg? I see that in the
vignette he indicated the two ways (pak and devtools).

I serve at a government agency and mostly work from behind a
firewall. Installing packages with devtool::install_packages() often
fails (timeout) but miraculously works fine with pak::pak(). I have no
idea why and it is probably related to the settings of the firewall,
but therefore I prefer pak over devtools now.

  • Vignette(s): demonstrating major functionality that runs
    successfully locally

The vignette outlines the primary functions but lacks a
straightforward workflow for beginners.

There are missing citations in the vignette. Please add a reference
list at the end.

references added as url's to the publication

In the vignette of the website it appears:

sw_sum(fellingdateR:::trs_example7, plot = TRUE)

but when I run locally i got:

sw_sum(fellingdateR:::trs_example7, plot = TRUE)
Error in as.data.frame(x) : object 'trs_example7' not found

Please consider change the three ::: by ::

redundant use of ::: has been removed from the examples in the
vignette

  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported
    functions
  • Community guidelines: including contribution guidelines in
    the README or CONTRIBUTING, and DESCRIPTION with URL,
    BugReports and Maintainer (which may be autogenerated via
    Authors@R).

No community guidelines found, only in DESCRIPTION there is a
BugReports linked.

Guidelines have been added to th README

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have
    been confirmed.
  • Performance: Any performance claims of the software have
    been confirmed.
  • Automated tests: Unit tests cover essential functions of
    the package and a reasonable range of inputs and conditions. All
    tests pass on the local machine.

All test pass. The coverage is 77.44 % (using
devtools::test_coverage())

  • Packaging guidelines: The package conforms to the rOpenSci
    packaging guidelines.

    • Please consider to use lowercase name as indicated by 1.1.1
      point of the packaging guidelines
    snake_case and lowercase naming have been implemented
    consistently
 -   Not all the functions follow the object_verb() naming scheme.
     For instance, the functions read_fh and get_header. I would
     suggest to author rename them to fh_read and fh_header
    Changed get_header to fh_header, but to highlight the close
    similarity to the original read.fh() function in the dplR
    package, I would like to keep read_fh as the function name.
 -   The movAv function's name deviates from the consistent use of
     snake_case observed in the rest of the functions. To maintain
     uniformity, it would be preferable to rename it in accordance
     with the snake_case as indicated in the packaging guidelines
    Changed to mov_av()
 -   In the get_header, movAv, read_fh functions please consider
     change cat() or print() for message or warning, as packaging
     guidelines indicated.
    cat() no longer used in any of the functions

Estimated hours spent reviewing: 13

  • Should the author(s) deem it appropriate, I agree to be
    acknowledged as a package reviewer ("rev" role) in the package
    DESCRIPTION file***.***

I've included you as a reviewer in the DESCRIPTON

Review Comments

Thank you for your contribution with the fellingdateR package. It
provides a valuable suite of functions for estimating, reporting, and
combining felling dates of historical tree-ring series. The
comprehensive set of tools makes it easier to infer felling date
estimates, considering factors like preserved sapwood or waney edge.

Given my expertise and interest in dendrochronology, I thoroughly
enjoyed reviewing this package. The functions enable users to estimate
felling date ranges for both individual series and groups of timbers,
offering a valuable feature for summing sapwood probability
distributions.

This being my first software review for Ropensci, I apologize for any
potential errors or unclear comments. While I've provided various
comments (a lot of them), feel free to focus on those that are most
relevant or useful. I commend you for effectively addressing a
challenging problem with this package. If you need clarification or
assistance implementing the suggestions, please reach out. Thank you
for submitting this software; the review process was a pleasure!

General comments

  • The data description lacks a consistent pattern. The author
    occasionally employs country abbreviations while at other times
    refrains from doing so. Let's strive for coherence in this regard.
This is because some of de provided sapwood datasets do not
cover a specific country, but rather a region, or are composed of
observations of timber from e.g. Norway but the timbers themselves
have been found in the Netherlands (van Daalen_Norway).*** ***So I
named the sapwood datasets according to their authors (and
publication date if applicable), and a few letters to describe the
region/country they cover or method used.

The names of the data sets can be easily copied from the output
of `sw-data_overview()`
  • Some code and documentation overpass the 80 characters
    (particularly data.R). Please consider adjust to this recommended
    length.

All code has been styled using the stylerpackage , and should
now adhere to to the Tidyverse styleguide.

  • The README file does not contain any information about how to
    cite. I found that one in the CITATION file and also in the
    website of the package. According to Ropensci packaging guidelines
    the README need to include information about how to cite.
Citation guidelines have been added tot the README
  • Please consider generate a theme for all plots of your package in
    a utils.R function
`theme_fdr()` is now part of `helper-functions.R` and
implemented in all plotting functions.
  • I found very useful the /paper/paper.md, and I think much of this
    information would improve also the documentation and the website
    of the package. For instance, the fellingdateR_workflow is very
    useful to understand how the package works. Would the author
    consider include it in the REAMDE or in the DESCRIPTION of the
    package instead of the current pictures? I think the former is
    more informative than the latter

I've inserted the workflow in the README. The paper.md file is
a first draft for a paper to be submitted to JOSS, after software
review..

  • Why is there no auxiliary function to sw_model (eg. sw_model_plot)
    like the rest of the functions in the package (eg. sw_interval and
    sw_interval_plot)? This could lighten the function size and also
    would be coherent with the rest of the package.
There is now a new function named sw_model_plot() as
suggested.
  • The code style isn't consistent. Please consider use the styler
    package.
I've used the styler package to reformat all code.
  • I didn't find top-level documentation:
I've added a `fellingdateR-package.R` file with top-level
documentation
?fellingdateR
#No documentation for ‘fellingdateR’ in specified packages and libraries:
#you could try ‘??fellingdateR’```
Pleas consider generate ?fellingdateR or `?fellingdateR-package`. See  https://devguide.ropensci.org/building.html#general 
  • The package has a well-structured website, however I thin much of
    the information included in the paper.md is not in the website.
    For instance the background and Statement of need within the
    paper.md is very useful and I think it would help to the potential
    users of the pacakge. Please consider include in the README or in
    an specific vignette.

The README has been rewritten, including some parts of the
paper.md.

  • The vignette "getting started" includes some citations (eg.
    Hollstein 1980) but not a full reference for this citations.
    Please include them at the end of the vignette.
Thanks! Bibliographic references are included as url's/doi's
now.
  • Please be consistent with the use of = or <-, but avoid mixed
    them. For instance in hdi.R

Now there is a consistent use of <- as an assignment
operator

  • spell check: DESCRIPTION does not contain 'Language' field.
    Defaulting to 'en-US'. Please consider to specify.
en-GB is now set as 'Language'
  • Is there code duplication in the package that should be
    reduced? Yes for instance the ggthemes within the plot functions
`theme_fdr()` is now part of `helper-functions.R` and
implemented in all plotting functions

Coverage

Please consider review the test coverage of the movAv.R function
(~57.14%)

test coverage increased for mov_av()

Goodpractices

After run the goodpractices::gp() several code lines does not pass
the check:

✖ write short and simple functions. These functions have high
cyclomatic complexity (>50): read_fh (150). You can make them easier
to reason about by encapsulating distinct steps of your function into
subfunctions.

This is particularly true for the read_fh function. Since the .fh
file structure is quite flexible and combines both meta-data and
measurement data into one format, the code is quite lengthy.
Furthermore, in the fellingdateR package I build upon the original
code of the read.fh function from the dplR package. I would prefer to
stay a close as possible to the original code in the dplR package in
order to facilitate future cooperation and possible integration of
both functions

✖ use '<-' for assignment instead of '='. '<-' is the standard, and
R users and developers are used it and it is easier to read your code
for them if you use '<-'.

Now there is a consistent use of <- as an assignment operator

R/cor_table.R:108:18
R/cor_table.R:109:22
R/cor_table.R:112:22
R/hdi.R:48:12
R/hdi.R:49:14
... and 2 more lines

✖ avoid long code lines, it is bad for readability. Also, many people
prefer editor windows that are about 80 characters wide. Try make your
lines shorter than 80 characters

styler package was used to reformat all code

data-raw/DATASETS.Rmd:17:81
data-raw/DATASETS.Rmd:19:81
data-raw/DATASETS.Rmd:35:81
data-raw/DATASETS.Rmd:51:81
data-raw/DATASETS.Rmd:68:81
... and 153 more lines

✖ avoid sapply(), it is not type safe. It might return a vector, or a
list, depending on the input data. Consider using vapply() instead.

R/cor_table.R:192:20
R/cor_table.R:195:20

sapply() no longer used

✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and
1:NCOL(...) expressions. They are error prone and result 1:0 if the
expression on the right hand side is zero. Use seq_len() or
seq_along() instead.

R/fd_report.R:162:19
R/hdi.R:54:16
R/sw_combine.R:123:27
R/sw_combine.R:180:27
R/sw_combine.R:361:35
... and 1 more lines

seq_len() is now used instead of 1:length(...)

✖ fix this R CMD check NOTE: Note: found 11 marked UTF-8 strings

✖ avoid 'T' and 'F', as they are just variables which are set to the
logicals 'TRUE' and 'FALSE' by default, but are not reserved words and
hence can be overwritten by the user. Hence, one should always use
'TRUE' and 'FALSE' for the logicals.

TRUE/FALSE now used consistently

R/sw_combine_plot.R:NA:NA
R/sw_combine_plot.R:NA:NA
R/sw_combine_plot.R:NA:NA
R/sw_combine_plot.R:NA:NA
R/sw_combine_plot.R:NA:NA
... and 12 more lines

read_fh.R

  • Whats new funtionalities include read_fh not covered by
    dplR::read.fh()?
dplR::read.fh() concentrates on extracting the measurement
data. The fellingdateR::read_fh() function extracts also the
descriptive (meta-)data from the HEADER fields in a .fh file. This
is not possible with the dplR::read.fh function.

Furthermore the fellingdateR::read_fh function allows to read
data in CHRON or HALF-CHRONO format.

read.fh() also throws errors when header fields include Capital
letters (depends on the software used to produce the .fh files:
TSAP, PAST, ...). read_fh() is case-insensitive.
  • There are several code commented that could be removed. eg.
    # NEW: verbose = TRUE, header = FALSE Please revise them
Indeed. The comments highlight what sections have been added
compared to the original dplR::read.fh function. I've removed the
\`NEW:\` comments in the current version of read_fh()
  • The nomenclature of the objetcs exhibits a mix of snake_case and
    dot.case. Let's ensure consistency in the naming convention.
In order to stay as close as possible to the original
dplR::read.fh function (and to facilitate possible future
integration of both functions) I haven't changed any of the
variable names in the code. For all other functions in the package
snake_case was implemented.
  • The function appears to be extensive. Has the author explored the
    possibility of breaking it down into several auxiliary functions
    to enhance readability and maintainability?
See reply on bullet point 3

Perfect! I've added the link in the description

get_header.R

  • Please consider include some defensive programming such us:
if (!is.data.frame(rwl) || !inherits(rwl, "rwl")) {
    stop("Input should be a data.frame of class 'rwl'")
  }
  • avoid cat as indicated in packaging guidelines
The function now starts with a check of the input and no longer
requires cat (or equivalent).

The get_header function was renamed to `fh_header`.

cor_table.R

  • Very interesting and useful function. Has the author contemplated
    breaking down this function into more modular components? For
    instance, each of the correlation values could potentially be
    extracted as auxiliary functions, providing users with the
    flexibility to call them individually as needed.
Certainly considered, but this would require a full rewrite of
the current function. Then it should also include the possibility
to use the separate correlation values for crossdating a series on
all positions along a reference series. For this package however,
the goal is to check the date assigned to the tree-ring series and
to find the chronology that matches best in order to decide which
sapwood model to use (depending on timber provenance). So I would
like to keep the function as it is, and have the breakdown into
modular components for each correlation value on the todo-list for
a future project/package.
  • The nomenclature of arguments exhibits a mix of snake_case and
    dot.case. Let's ensure consistency in the naming convention***.***
now all snake_case
  • Remove code comments in the argument section, eg***.***
Removed
...
output = "table",
#c("matrix", "table")
... 
  • In the documentation for the argument sort_by, it would be
    useful to add the choices to avoid errors. For instance,
#' @param sort_by The correlation value by which the output is sorted for each
#'   series in `x`. One of "r_pearson", "t_St", "glk", "glk_p", "t_BP", "t_Ho". Default to "t_Ho"

Added to the documentation

  • Ensure uniformity in assignment by using the same operator
    consistently---avoid mixing <- and =
`=` no longer used as assignment operator
y <- y[, apply(!is.na(y), 2, sum) > 3, drop = FALSE]

instead of

y <-
               y[, sapply(y, function(col)
                    sum(!is.na(col))) > 3, drop = FALSE]

sapply() no longer used in this, or other functions

sw_model.R

  • Ensure uniformity in assignment by using the same operator
    consistently---avoid mixing <- and =. eg.
    n_obs = sum(observed$count)
`=` no longer used as assignment operator
  • I'm not sure if incorporating the reading of a csv is optimal in a
    function. What if a user wants to input a dataframe that isn't
    from a CSV file? I propose modifying the function to accept a
    dataframe or tibble with mandatory n_sapwood and count
    columns. Additionally, allowing users to import from a CSV could
    be achieved by either creating a small auxiliary function or
    simply utilizing utils::read.csv(). This also reduce the size of
    the sw_model function. Please consider this for all those
    functions with the same issue.
You are right! I might have over complicated the sw-functions
with this. So I removed this option from this and all other
sw_functions. Now a user-defined custom data set -
a`data.frame`*** ***with columns `n_sapwood` and `count` - can be
supplied to the function.
  • I think it is better to have a separate function for plotting,
    which could be called after the computation is completed by
    sw_model, for instance, sw_model_plot. This also allow to user
    modify the plot generated. For instance if the user doesn`t like
    the grid within the plot. In the current form to do that the user
    need to type:
s <- sw_model("Hollstein_1980", plot  = TRUE)
s + theme()panel.grid = element_blank()

sw_model_plot() was added to the package

So if there is an specific function to plot the data, this function
would include the ... in the arguments, to allow include other
ggplot2 arguments.

  • Anycase, in the plot section would be better to use .data, for
    instance, in
ggplot2::geom_segment(
  data = hdi_model,
  ggplot2::aes(
    x = hdi_model[[1]],
    xend = hdi_model[[2]],
    y = 0,
    yend = 0
  ),
  colour = "grey30",
  linewidth = .8,
  alpha = 0.8
)

I suggest to use .data[[1]]

ggplot2::geom_segment(
  data = hdi_model,
  ggplot2::aes(
    x = .data[[1]],
    xend = .data[[2]],
    y = 0,
    yend = 0
  ),
  colour = "grey30",
  linewidth = .8,
  alpha = 0.8
)

no longer needed in the new function sw_model_plot()

  • In the documentation of sw_model, the values of the densfun
    parameter are not in italics and are enclosed in double quotation
    marks. It doesn't matter, but please maintain consistency with the
    rest of the documentation. For example, in sw_interval.R, they are
    in italics.
densfun documentation in italic now

sw_interval.R

  • Please remove code commented

      # Add calendar years to output when y is provided
      # if (last == 0) {
      #         attr(hdi_int, "credMass") <- credMass
      #         attr(hdi_int, "sapwood_data") <- sw_data
      #         attr(hdi_int, "model") <- densfun
      #    }
      # if (last != 0) {
    
comments have been removed
  • In this function, the separation of the plot part from the
    computation one is appreciated. A further step would be to remove
    the plotting option completely from the function, and specified in
    the documentation (and also in the vignette) to use the
    sw_interval_plot function to obtain the plot.
plot argument is now set to plot = FALSE as default. The
numeric output of the function can then be supplied to the
plotting functions to create the visualization.

sw_interval_plot.R

  • I would suggest to add the ellipsis argument (i.e. ...) to allow
    another ggplot2 arguments. See also
    ellipsis pkg for more info.
    Consider apply this throughout the plot functions of the package.

  • Which notes are the author trying to avoid with?

    When I run devtools::check() I get

    ❯ checking R code for possible problems ... NOTE  
    x: no visible binding for global variable ‘p’ 
    
assigning NULL to these variables avoids the notes., as
described in R Packages (2e)
<https://r-pkgs.org/package-within.html#delta-a-failed-attempt-at-making-a-package>
# to avoid notes in CMD check
        p.x <- upper <- year <- NULL

¿any alternative to solve them?

  • Please, consider change this code
        if (all(
                !(attributes(x)$names) %in% c(
                        "year",
                        "n_sapwood",
                        "p")
        ))
        stop("Input differs from output sw_interval()")

by:

if (!all(c("year", "n_sapwood", "p") %in% names(attributes(x)))) {
  stop("Input structure differs from the expected output of sw_interval()")
}

OK. I've replaced this section in the code

  • You mixed snake_case with dot.case in the naming of objects. I
    would suggest to choose one and be consistent.
should_be_consistent_now
  • What this code snippet do? If I understand correctly, the
    sw_interval_plot requires the output of sw_interval, so is a .csv
    format an output of the sw_interval?
The following was deleted because the option to use a .csv file
as input was removed from all functions.
        if (grepl("\\.csv$", sw_data.p))
                sw_data.p <- basename(sw_data.p)
  • Please specify the meaning of the 'p' on the y-axis.
p = probability. documented now in \@return

sw_combine.R

  • This function requires a dataframe, as indicated in the
    documentation. However, there is no check for that; instead, the
    first line forces x to be a data.frame. I would suggest to add a
    defensive programming line such as:
   if (!is.data.frame(x)) {
    stop("Input 'x' must be a dataframe.")
  }

More defensive programming is now implemented.
check_input() is now a helper function (helper-functions.R).

  • Camel case and snake_case styles are mixed in the function. Please
    be consistent.
all snake_case now
  • There are some duplications in the function, for instance
    pdf_matrix[, 1] <- timeAxis
now only mentioned once
  • Please revise the comments within the function. Are they all
    necessary? Example:
      # when multiple exact felling dates are listed that do no correspond
      # --> COMB = 0 and after scaling NaN (division by 0)
      # check rowwise if there is any p-value == 1,
      # and replace COMB at that position with 1

Most comments have been removed

  • Perhaps this function could be modularized into several simpler
    functions. Initially, a function could evaluate whether the
    dataset contains all series with preserved sapwood or if any
    series has an exact felling date, among other conditions (as
    indicated in the examples of the function). Subsequently, based on
    the initial evaluation, the function could call auxiliary
    functions.
A check_input() function (in helper-functions.R) now takes care
of the input
  • In any case, the function should provide detailed information
    about the approaches used for each case.
This is explained in the Get started vignette.
  • Please, see my comment about the data input in csv format for
    the sw_model.R
Input from .csv file no longer supported.
  • The elements of the output list generated by this function are not
    described. Please add to the documentation.
Added to the documentation (\@return)

sw_combine_plot.R

  • Please consider taking out the rescale function out of the
    sw_combine_plot function.
`rescale()`is now part of helper-functions.R
  • A lot of mixing T, F with TRUE and FALSE. Please be
    consistent***.***
only TRUE and FALSE are used now
  • Camel case and snake_case styles are mixed in the function. Please
    be consistent..
all snake_case
  • I would suggest to allow the user customize the color used. See my
    comment in the below section (Plot functions).
now possible to choose colors for fill and line.
  • Remove the commented codes inside the function:
comments removed
      # NEXT LINE TRIGGERS WARNING
      # Warning message:
      # Using one column matrices in `filter()` was deprecated in dplyr 1.1.0.
      # ℹ Please use one dimensional logical vectors instead.
      # ℹ The deprecated feature was likely used in the fellingdateR package.
      # Please report the issue to the authors.
      # { if (nrow(summary |> dplyr::filter(agreement == "poor")) != 0)
      # replaced by:
  • The function lacks an axis legend. I presume that the x-axis
    corresponds to calendar time, and the y-axis corresponds to 'p'.
    Is my understanding correct?
Correct. Label added to X-axis.
  • Any way to indicate the "A_i" as real subscript? maybe parse =
    TRUE within geom_text?
Finally found out how to do this ;-) Done.
  • Please add information in the documentation about the agregation
    index
Added to the documentation
  • In the second example:
combo <- fellingdateR::sw_combine(trs_example2)
fellingdateR::sw_combine_plot(combo)

I removed ::from the example.

  • What do the arrow and the dot refer? Please include in the
    documentation.

sw_sum.R

  • In the documentation there is a mistake. In the description of the
    last argument please add a blank space in xcontaining
Thanks! Corrected
  • In the documentation the values of the densfun parameter are not
    in italics and are enclosed in double quotation marks. It doesn't
    matter, but please maintain consistency with the rest of the
    documentation. For example, in sw_interval.R, they are in italics.
italic and consistent now
  • Regarding defensive programming idem comment that for sw_combine.
    This function requires a dataframe, as indicated in the
    documentation. However, there is no check for that; instead, the
    first line forces x to be a data.frame. I would suggest to add a
    defensive programming line such as:
defensive programming with helper function check_input()
implemented
   if (!is.data.frame(x)) {
    stop("Input 'x' must be a dataframe.")
  }

sw_sum_plot.R

  • Good point here using customizable colors. Let's implement this
    feature consistently across other plot functions for a cohesive
    and customizable user experience.
Done for sw_model_plot, sw_interval_plot and sw_sum
  • Can a user change the width of the moving average window used
    within the sw_sum_plot()? It seems that the sw_sum_plot() uses
    always the same.
Extra parameter added to tweak the window width of the
smoother
  • Idem with the splines used.

See previous bullet

  • What do the blue dots, blue bars and red line mean? Please
    indicate in the documentation
The dots represent exact felling dates (with waney edge).
Probability of 1 assigned to 1 specific year.

Added to documentation

Other comments

  • I am surprised that different functions (for instance sw_combine
    and sw_sum), that accessing similar subdata within a data set, use
    different approaches:
cambium <- df[[waneyedge]] 

and

cambium <- x[, waneyedge]

Please be consistent along the package.

Now same approach in all functions.

  • Please review the naming conventions, including CamelCase,
    snake_case, and dot.case, used for objects throughout the package.
    It is recommended to choose a consistent style, and I would
    suggest using snake_case. Ensure uniformity across the package for
    improved clarity and maintainability***.***
Done. All snake_case now.
  • I'm not sure if incorporating the reading of a csv is optimal in a
    function. What if a user wants to input a dataframe that isn't
    from a CSV file? I propose modifying the function to accept a
    dataframe or tibble with mandatory n_sapwood and count
    columns. Additionally, allowing users to import from a CSV could
    be achieved by either creating a small auxiliary function or
    simply utilizing utils::read.csv(). This also reduce the size of
    the sw_model function***.***
This option has been removed.
  • In the different plot functions, the authors uses a common theme.
    It might be useful to create an auxiliary function with a
    ggplot2::theme() specification. Then, each plot function could
    used this auxiliary function by default or also allow to the user
    to specify a custom theme.
`theme_fdr()` is now part of `helper-functions.R` and
implemented in all plotting functions.
  • I would suggest changing the specified colors in each plot
    function to custom arguments, allowing users to choose their
    preferred colors. For instance, in th sw_interval_plot the colors
    of the area specified by
                ggplot2::geom_area(
                        ggplot2::aes(x = ifelse(year >= lower, year, NA),
                                     y = p.x),
                        fill = "tomato3",
                        color = "tomato3",
                        alpha = 0.3,
                        linewidth = 1

could be indicated as arguments, e.g. area_fill and area_color.

Done for sw_sum_plot, sw_interval_plot and sw_model.plot

Code style

All code/functions/files have been styled according to the tidyverse
styleguide using styler::style_file(…) to avoid inconsistencies in
indentation and to avoid code goes over 80 characters

Now <- is used consistently throughout the package (some code in the previous version included = for assignment.`

TRUE instead of T

README

The package description in the README.md was revised according to the
comments of both reviewers.

The original illustration was removed, and replaced by an image of a
piece of historical timber with sapwood. The question mark highlights
what this package basically does: estimate the missing number of sapwood
rings.

Now README.md has a final paragraph with a link to the Contributor
Code of Conduct

## Comments and contributions

cor_table

parameter `values` was removed from the function. Looking back,
this is not an option that would be used frequently., and is certainly
not required. Removing it from the function allows to shorten the code a
bit.

@hanecakr
Copy link
Author

It was suggested to rename the package to fellingdater instead of fellingdateR. Any tips on how to do this without breaking links and initializing git issues? @maelle @ajpelu @njtierney

@hanecakr
Copy link
Author

I have a first draft of a paper ["paper/paper.md"] that I want to submit to JOSS if/when I manage to address all issues raised in the review proces. Can this paper remain a .md file under the paper-folder, should I remove it from the package until after review by JOSS, or should I already reformat it to a vignette?

@maelle
Copy link
Member

maelle commented Mar 21, 2024

@hanecakr thank you! In your answer, it is hard for me to see what is quoted from the reviews, and what is your answer. Could you please consistently use quote formatting > for the parts that come from reviewers please? Thank you!

@maelle
Copy link
Member

maelle commented Mar 21, 2024

@hanecakr Regarding renaming, you're in luck, @njtierney wrote a post on this very topic years ago: https://www.njtierney.com/post/2017/10/27/change-pkg-name/ (linked from the dev guide)

@ajpelu
Copy link

ajpelu commented Mar 21, 2024

Hi @hanecakr Thanks for your message. I have seen your changes, and for me it is fine. If you need help to improve the pkg, please let me know. We may want to collaborate on future versions of the package, as this topic (i.e. combining dendrochronology and R) is in line with my research interests. Also if you need any help with the manuscript, let me know.

@maelle Thanks for the opportunity to review this pkg. It was an inspiring experience.

@hanecakr
Copy link
Author

hanecakr commented Mar 21, 2024

@hanecakr thank you! In your answer, it is hard for me to see what is quoted from the reviews, and what is your answer. Could you please consistently use quote formatting > for the parts that come from reviewers please? Thank you!

@maelle: All my comments were in bold-italic, but changed that to >

@hanecakr
Copy link
Author

hanecakr commented Mar 21, 2024

@hanecakr Regarding renaming, you're in luck, @njtierney wrote a post on this very topic years ago: https://www.njtierney.com/post/2017/10/27/change-pkg-name/ (linked from the dev guide)

@maelle: Thanks for the link to @njtierney 's blog. Meanwhile I also got some help from @koenedaele and the package/repository has been renamed to fellingdater.

@maelle
Copy link
Member

maelle commented Mar 21, 2024

@hanecakr thank you! I'd have recommended the opposite (quotes using the quote syntax, your comments as is) but whatever works!

@ajpelu thank you! Could you please post your answer with the reviewer's approval template?

@njtierney could you please have a look at the answer to your review?

Thanks all for all your work 🙏

@ajpelu
Copy link

ajpelu commented Mar 21, 2024

Reviewer Response

@hanecakr Thanks for your message. I have seen your changes, and for me it is fine. If you need help to improve the pkg, please let me know. We may want to collaborate on future versions of the package, as this topic (i.e. combining dendrochronology and R) is in line with my research interests. Also if you need any help with the manuscript, let me know.

@maelle Thanks for the opportunity to review this pkg. It was an inspiring experience.

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 13 h

@maelle
Copy link
Member

maelle commented Mar 28, 2024

@njtierney could you please have a look at the answer to your review? Please use the reviewer's approval template.

@njtierney
Copy link
Contributor

Reviewer Response

Thanks for the responses, @hanecakr! You have done a great job addressing these comments. Thank you so much for taking the time to implement these. Well done! I've got a couple of minor comments below but I think this is all good to go from my perspective.

Formatting these reviews can be a bit tricky, but I did find it hard to know which parts you were responding to - in particular the documentation parts. This could have been resolved by placing my entire review in block quotes and then you responding either in further nested block quotes (> >) or by responding outside the blockquote.

README

The README looks to be updated substantially, thank you for taking the time to do this! One minor quirk I noticed with the README is:

The presence of partially preserved sapwood (sw) allows to estimate the missing number of sapwood rings ( ? in figure below), and to report an interval in which the actual felling date (fd) likely falls.

I'm not sure what to make of the ? in figure below - is that a typo?

I also really appreciate the new workflow diagram, that is very nice!

package rename

Well done on renaming the package! One small quip - you still have the RStudio project named fellingDateR - I think this should be fellingdater.

Using <-

There is one case in a test: test-sw_model.R where = is used.

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 6.5

@maelle
Copy link
Member

maelle commented Apr 8, 2024

@njtierney thanks!!

@maelle
Copy link
Member

maelle commented Apr 8, 2024

@ropensci-review-bot approve fellingdater

@ropensci-review-bot
Copy link
Collaborator

Approved! Thanks @hanecakr for submitting and @njtierney, @ajpelu for your reviews! 😁

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You will need to enable two-factor authentication for your GitHub account.
    This invitation will expire after one week. If it happens write a comment @ropensci-review-bot invite me to ropensci/<package-name> which will re-send an invitation.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar
  • Skim the docs of the pkgdown automatic deployment, in particular if your website needs MathJax.
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/ for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also feature CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

@maelle
Copy link
Member

maelle commented Apr 8, 2024

@hanecakr thanks so much for your work on your package!

@ajpelu @njtierney thanks a ton again for reviewing!

@hanecakr
Copy link
Author

hanecakr commented Apr 9, 2024

Dear @maelle , @ajpelu and @njtierney thank you so much for all your time and effort you´ve invested in reviewing my package 🙏🙏🙏. It was a first for me and learned a lot. Thanks to you the quality of the package improved significantly!
I´m on a short family trip right now (without laptop 😅), but will address the remaining issues and transfer to rOpenSci´s repo next week.
My best wishes,
Kristof

@hanecakr
Copy link
Author

@ropensci-review-bot finalize transfer of fellingdater

@ropensci-review-bot
Copy link
Collaborator

Transfer completed.
The fellingdater team is now owner of the repository and the author has been invited to the team

@hanecakr
Copy link
Author

@maelle, the transfer to rOpenSci's repo succeeded! 🚀
Shall I add you as a reviewer as well in DESCRIPTION?
Thanks again for all your time and guidance.

@maelle
Copy link
Member

maelle commented Apr 16, 2024

Awesome, great to read! No need to list me, your participation in the process is thanks enough. Thank you for all your work and your kind words!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants