Modify getComps and writeComps #154

chantelwetzel-noaa · 2025-02-20T22:02:36Z

These changes deal with a couple of issues and other minor changes to improve functionality:

getComps()

Add a column where input sample size is calculated based on the number of trips and fish (referred to as the Stewart method) called n_stewart.
Input sample size options (n_fish, n_tows, and n_stewart) are calculated separately for sexed and unsexed fish.
Add a warning about the revision to sample size calculation, noting that for single sex models when there are both sexed and unsexed fish, it is recommended to modify the SEX column to be all unsexed external to the function call.

writeComps()

Require comp_bin argument to be specified.
Modify the function to NOT save output if fname is NULL.
Add other default options for column_with_input_n arguments and add a match.arg call to check. The pro of this change is that it makes it clearer to users the default options for users. The con to this change is that it does not allow for a user to pass a unique input sample size calculation to this function.

Given the conversation on #151, I would like both @iantaylor-NOAA and @okenk to carefully consider the change to getComps() to make sure this is the approach we would like to modify to. @brianlangseth-NOAA feel free to look this over as well since these changes were based on our discussion as well.

#151

…comps

iantaylor-NOAA

@chantelwetzel-noaa, nice work with all these improvements. All the changes look good to me. I pushed one minor update to the documentation.

By way of testing, I took the Yellowtail pacfin BDS data and ran the following commands on the cleaned pacfin data to get unexpanded comps (not sure if this is the right approach as discussed in #150)

test <- bds_clean |> dplyr::mutate(Sample_size = 1) |> 
  pacfintools::getComps(
    Comps = "LEN",
    weightid = "Sample_size"
  )

The sample sizes associated with sexed and unsexed fish with the 3 input_n options look good to me with the exception of the column_with_input_n = "n_fish" option where for some reason an additional row of 0 comps was created. I'm taking leave all afternoon so sadly can't debug this now but am happy to do so on Monday.

pacfintools::writeComps(test, comp_bins = len_bin) |> dplyr::filter(year == 2024) # default is "n_tows"
# A tibble: 2 x 44
   year month fleet   sex partition input_n   f20   f22   f24   f26   f28   f30   f32   f34   f36   f38   f40   f42   f44   f46   f48   f50   f52   f54   f56   m20   m22   m24
  <int> <fct> <dbl> <dbl>     <dbl>   <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  2024 7         1     0         2       7     0     0     1     0     0     0     4     2     0     0     0     0     0     1     0     0     0     0     0     0     0     0     
2  2024 7         1     3         2     181     0     0     0     0     0     2     6     8     9    23    82   150   189   201   177   114    50    17     4     0     0     0     
# i 16 more variables: m26 <dbl>, m28 <dbl>, m30 <dbl>, m32 <dbl>, m34 <dbl>, m36 <dbl>, m38 <dbl>, m40 <dbl>, m42 <dbl>, m44 <dbl>, m46 <dbl>, m48 <dbl>, m50 <dbl>, m52 <dbl>,    
#   m54 <dbl>, m56 <dbl>

pacfintools::writeComps(test, comp_bins = len_bin, column_with_input_n = "n_stewart") |> dplyr::filter(year == 2024) 
# A tibble: 2 x 44
   year month fleet   sex partition input_n   f20   f22   f24   f26   f28   f30   f32   f34   f36   f38   f40   f42   f44   f46   f48   f50   f52   f54   f56   m20   m22   m24
  <int> <fct> <dbl> <dbl>     <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     
1  2024 7         1     0         2    8.79     0     0     1     0     0     0     4     2     0     0     0     0     0     1     0     0     0     0     0     0     0     0     
2  2024 7         1     3         2  695.       0     0     0     0     0     2     6     8     9    23    82   150   189   201   177   114    50    17     4     0     0     0     

pacfintools::writeComps(test, comp_bins = len_bin, column_with_input_n = "n_fish") |> dplyr::filter(year == 2024) 
# A tibble: 3 x 44
   year month fleet   sex partition input_n   f20   f22   f24   f26   f28   f30   f32   f34   f36   f38   f40   f42   f44   f46   f48   f50   f52   f54   f56   m20   m22   m24
  <int> <fct> <dbl> <dbl>     <dbl>   <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     
1  2024 7         1     0         2      13     0     0     1     0     0     0     4     2     0     0     0     0     0     1     0     0     0     0     0     0     0     0     
2  2024 7         1     3         2    2040     0     0     0     0     0     2     6     8     9    23    82   150   189   201   177   114    50    17     4     0     0     0     
3  2024 7         1     3         2    1683     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

chantelwetzel-noaa · 2025-02-21T20:02:00Z

@iantaylor-NOAA Thanks for looking at this. I encountered a similar issue of getting multiple rows for a single fleet where one row was female comps. and the second was male comps when I was adding the new input sample size calculation. This was occurring when there were different sample sizes for each sex. I am unclear why this may be occurring in your example when using n_fish. Since you provided the code, I can test this out for my comps to see if I can fix the n_fish calculation in getComps().

brianlangseth-NOAA · 2025-02-24T16:02:22Z

I looked this over on friday and didn't see any concerns but will look over the changes today.

brianlangseth-NOAA · 2025-02-24T22:38:59Z

The changes looks good to me. In addition to addressing some matters in #151, the changes resolve my issue #153, which can be closed once this is merged.

Id suggest adding a statement indicating output is not saved when fname is not specified. This is stated in the help, so is not technically needed, but I like having it as part of the output, which is done for get_raw_comps via check_dir in the nwfscSurvey package. If you do, consider adding this in line 178

if (is.null(fname) & verbose) {
    cli::cli_alert_info("Output will not be saved in fname because fname = NULL.")
  }

chantelwetzel-noaa · 2025-02-24T22:44:43Z

@brianlangseth-NOAA Thank you for taking the time to look over these changes. I can add the requested output message.

@okenk Have you had an opportunity to look over these changes? If not (no worries), can you give me a timeline for your review? I want to make sure I get this merged into the main branch, if the changes are acceptable, since I know many teams are likely currently processing their fishery data.

brianlangseth-NOAA · 2025-02-25T21:14:05Z

@chantelwetzel-noaa The newest commit (4d32c86) works for me and now outputs the correct records.

chantelwetzel-noaa added 7 commits February 20, 2025 13:02

add stewart input sample size calculation

0842ccc

revise input sample size calc to be calculated for sexed and unsexed

ed97955

#151

add warning about sample size calc

ff4c7fb

document

7850540

require input for comp_bins (#153), modify file writing, document

2aecdf7

Merge remote-tracking branch 'origin/get-write-comps' into get-write-…

e622eb1

…comps

rename stewart to n_stewart for clarity

57700b8

chantelwetzel-noaa requested review from okenk and iantaylor-NOAA February 20, 2025 22:02

chantelwetzel-noaa and others added 2 commits February 20, 2025 14:09

add verbose to warning

d4bb5ee

minor updates to writeComps documentation

c18d540

iantaylor-NOAA approved these changes Feb 21, 2025

View reviewed changes

chantelwetzel-noaa added 2 commits February 21, 2025 12:15

fix: calculate a combined n_fish for M and F

60e1ec1

streamline the input n calc for sexed and unsexed fish

3a5f62b

chantelwetzel-noaa added 2 commits February 24, 2025 14:46

add message when output will not be saved

1b33176

fix: data binning that was eliminate records

4d32c86

chantelwetzel-noaa merged commit 9508c09 into main Feb 25, 2025

chantelwetzel-noaa deleted the get-write-comps branch February 25, 2025 23:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modify getComps and writeComps #154

Modify getComps and writeComps #154

chantelwetzel-noaa commented Feb 20, 2025 •

edited

Loading

iantaylor-NOAA left a comment

chantelwetzel-noaa commented Feb 21, 2025

brianlangseth-NOAA commented Feb 24, 2025

brianlangseth-NOAA commented Feb 24, 2025

chantelwetzel-noaa commented Feb 24, 2025

brianlangseth-NOAA commented Feb 25, 2025

Modify getComps and writeComps #154

Modify getComps and writeComps #154

Conversation

chantelwetzel-noaa commented Feb 20, 2025 • edited Loading

iantaylor-NOAA left a comment

Choose a reason for hiding this comment

chantelwetzel-noaa commented Feb 21, 2025

brianlangseth-NOAA commented Feb 24, 2025

brianlangseth-NOAA commented Feb 24, 2025

chantelwetzel-noaa commented Feb 24, 2025

brianlangseth-NOAA commented Feb 25, 2025

chantelwetzel-noaa commented Feb 20, 2025 •

edited

Loading