Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify getComps and writeComps #154

Merged
merged 13 commits into from
Feb 25, 2025
Merged

Modify getComps and writeComps #154

merged 13 commits into from
Feb 25, 2025

Conversation

chantelwetzel-noaa
Copy link
Contributor

@chantelwetzel-noaa chantelwetzel-noaa commented Feb 20, 2025

These changes deal with a couple of issues and other minor changes to improve functionality:

getComps()

  1. Add a column where input sample size is calculated based on the number of trips and fish (referred to as the Stewart method) called n_stewart.
  2. Input sample size options (n_fish, n_tows, and n_stewart) are calculated separately for sexed and unsexed fish.
  3. Add a warning about the revision to sample size calculation, noting that for single sex models when there are both sexed and unsexed fish, it is recommended to modify the SEX column to be all unsexed external to the function call.

writeComps()

  1. Require comp_bin argument to be specified.
  2. Modify the function to NOT save output if fname is NULL.
  3. Add other default options for column_with_input_n arguments and add a match.arg call to check. The pro of this change is that it makes it clearer to users the default options for users. The con to this change is that it does not allow for a user to pass a unique input sample size calculation to this function.

Given the conversation on #151, I would like both @iantaylor-NOAA and @okenk to carefully consider the change to getComps() to make sure this is the approach we would like to modify to. @brianlangseth-NOAA feel free to look this over as well since these changes were based on our discussion as well.

Copy link
Contributor

@iantaylor-NOAA iantaylor-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chantelwetzel-noaa, nice work with all these improvements. All the changes look good to me. I pushed one minor update to the documentation.

By way of testing, I took the Yellowtail pacfin BDS data and ran the following commands on the cleaned pacfin data to get unexpanded comps (not sure if this is the right approach as discussed in #150)

test <- bds_clean |> dplyr::mutate(Sample_size = 1) |> 
  pacfintools::getComps(
    Comps = "LEN",
    weightid = "Sample_size"
  )

The sample sizes associated with sexed and unsexed fish with the 3 input_n options look good to me with the exception of the column_with_input_n = "n_fish" option where for some reason an additional row of 0 comps was created. I'm taking leave all afternoon so sadly can't debug this now but am happy to do so on Monday.

pacfintools::writeComps(test, comp_bins = len_bin) |> dplyr::filter(year == 2024) # default is "n_tows"
# A tibble: 2 x 44
   year month fleet   sex partition input_n   f20   f22   f24   f26   f28   f30   f32   f34   f36   f38   f40   f42   f44   f46   f48   f50   f52   f54   f56   m20   m22   m24
  <int> <fct> <dbl> <dbl>     <dbl>   <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  2024 7         1     0         2       7     0     0     1     0     0     0     4     2     0     0     0     0     0     1     0     0     0     0     0     0     0     0     
2  2024 7         1     3         2     181     0     0     0     0     0     2     6     8     9    23    82   150   189   201   177   114    50    17     4     0     0     0     
# i 16 more variables: m26 <dbl>, m28 <dbl>, m30 <dbl>, m32 <dbl>, m34 <dbl>, m36 <dbl>, m38 <dbl>, m40 <dbl>, m42 <dbl>, m44 <dbl>, m46 <dbl>, m48 <dbl>, m50 <dbl>, m52 <dbl>,    
#   m54 <dbl>, m56 <dbl>

pacfintools::writeComps(test, comp_bins = len_bin, column_with_input_n = "n_stewart") |> dplyr::filter(year == 2024) 
# A tibble: 2 x 44
   year month fleet   sex partition input_n   f20   f22   f24   f26   f28   f30   f32   f34   f36   f38   f40   f42   f44   f46   f48   f50   f52   f54   f56   m20   m22   m24
  <int> <fct> <dbl> <dbl>     <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     
1  2024 7         1     0         2    8.79     0     0     1     0     0     0     4     2     0     0     0     0     0     1     0     0     0     0     0     0     0     0     
2  2024 7         1     3         2  695.       0     0     0     0     0     2     6     8     9    23    82   150   189   201   177   114    50    17     4     0     0     0     

pacfintools::writeComps(test, comp_bins = len_bin, column_with_input_n = "n_fish") |> dplyr::filter(year == 2024) 
# A tibble: 3 x 44
   year month fleet   sex partition input_n   f20   f22   f24   f26   f28   f30   f32   f34   f36   f38   f40   f42   f44   f46   f48   f50   f52   f54   f56   m20   m22   m24
  <int> <fct> <dbl> <dbl>     <dbl>   <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     
1  2024 7         1     0         2      13     0     0     1     0     0     0     4     2     0     0     0     0     0     1     0     0     0     0     0     0     0     0     
2  2024 7         1     3         2    2040     0     0     0     0     0     2     6     8     9    23    82   150   189   201   177   114    50    17     4     0     0     0     
3  2024 7         1     3         2    1683     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     

@chantelwetzel-noaa
Copy link
Contributor Author

@iantaylor-NOAA Thanks for looking at this. I encountered a similar issue of getting multiple rows for a single fleet where one row was female comps. and the second was male comps when I was adding the new input sample size calculation. This was occurring when there were different sample sizes for each sex. I am unclear why this may be occurring in your example when using n_fish. Since you provided the code, I can test this out for my comps to see if I can fix the n_fish calculation in getComps().

@brianlangseth-NOAA
Copy link
Contributor

I looked this over on friday and didn't see any concerns but will look over the changes today.

@brianlangseth-NOAA
Copy link
Contributor

The changes looks good to me. In addition to addressing some matters in #151, the changes resolve my issue #153, which can be closed once this is merged.

Id suggest adding a statement indicating output is not saved when fname is not specified. This is stated in the help, so is not technically needed, but I like having it as part of the output, which is done for get_raw_comps via check_dir in the nwfscSurvey package. If you do, consider adding this in line 178

if (is.null(fname) & verbose) {
    cli::cli_alert_info("Output will not be saved in fname because fname = NULL.")
  }

@chantelwetzel-noaa
Copy link
Contributor Author

@brianlangseth-NOAA Thank you for taking the time to look over these changes. I can add the requested output message.

@okenk Have you had an opportunity to look over these changes? If not (no worries), can you give me a timeline for your review? I want to make sure I get this merged into the main branch, if the changes are acceptable, since I know many teams are likely currently processing their fishery data.

@brianlangseth-NOAA
Copy link
Contributor

@chantelwetzel-noaa The newest commit (4d32c86) works for me and now outputs the correct records.

@chantelwetzel-noaa chantelwetzel-noaa merged commit 9508c09 into main Feb 25, 2025
@chantelwetzel-noaa chantelwetzel-noaa deleted the get-write-comps branch February 25, 2025 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants