-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify getComps and writeComps #154
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chantelwetzel-noaa, nice work with all these improvements. All the changes look good to me. I pushed one minor update to the documentation.
By way of testing, I took the Yellowtail pacfin BDS data and ran the following commands on the cleaned pacfin data to get unexpanded comps (not sure if this is the right approach as discussed in #150)
test <- bds_clean |> dplyr::mutate(Sample_size = 1) |>
pacfintools::getComps(
Comps = "LEN",
weightid = "Sample_size"
)
The sample sizes associated with sexed and unsexed fish with the 3 input_n options look good to me with the exception of the column_with_input_n = "n_fish"
option where for some reason an additional row of 0 comps was created. I'm taking leave all afternoon so sadly can't debug this now but am happy to do so on Monday.
pacfintools::writeComps(test, comp_bins = len_bin) |> dplyr::filter(year == 2024) # default is "n_tows"
# A tibble: 2 x 44
year month fleet sex partition input_n f20 f22 f24 f26 f28 f30 f32 f34 f36 f38 f40 f42 f44 f46 f48 f50 f52 f54 f56 m20 m22 m24
<int> <fct> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2024 7 1 0 2 7 0 0 1 0 0 0 4 2 0 0 0 0 0 1 0 0 0 0 0 0 0 0
2 2024 7 1 3 2 181 0 0 0 0 0 2 6 8 9 23 82 150 189 201 177 114 50 17 4 0 0 0
# i 16 more variables: m26 <dbl>, m28 <dbl>, m30 <dbl>, m32 <dbl>, m34 <dbl>, m36 <dbl>, m38 <dbl>, m40 <dbl>, m42 <dbl>, m44 <dbl>, m46 <dbl>, m48 <dbl>, m50 <dbl>, m52 <dbl>,
# m54 <dbl>, m56 <dbl>
pacfintools::writeComps(test, comp_bins = len_bin, column_with_input_n = "n_stewart") |> dplyr::filter(year == 2024)
# A tibble: 2 x 44
year month fleet sex partition input_n f20 f22 f24 f26 f28 f30 f32 f34 f36 f38 f40 f42 f44 f46 f48 f50 f52 f54 f56 m20 m22 m24
<int> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2024 7 1 0 2 8.79 0 0 1 0 0 0 4 2 0 0 0 0 0 1 0 0 0 0 0 0 0 0
2 2024 7 1 3 2 695. 0 0 0 0 0 2 6 8 9 23 82 150 189 201 177 114 50 17 4 0 0 0
pacfintools::writeComps(test, comp_bins = len_bin, column_with_input_n = "n_fish") |> dplyr::filter(year == 2024)
# A tibble: 3 x 44
year month fleet sex partition input_n f20 f22 f24 f26 f28 f30 f32 f34 f36 f38 f40 f42 f44 f46 f48 f50 f52 f54 f56 m20 m22 m24
<int> <fct> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2024 7 1 0 2 13 0 0 1 0 0 0 4 2 0 0 0 0 0 1 0 0 0 0 0 0 0 0
2 2024 7 1 3 2 2040 0 0 0 0 0 2 6 8 9 23 82 150 189 201 177 114 50 17 4 0 0 0
3 2024 7 1 3 2 1683 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
@iantaylor-NOAA Thanks for looking at this. I encountered a similar issue of getting multiple rows for a single fleet where one row was female comps. and the second was male comps when I was adding the new input sample size calculation. This was occurring when there were different sample sizes for each sex. I am unclear why this may be occurring in your example when using |
I looked this over on friday and didn't see any concerns but will look over the changes today. |
The changes looks good to me. In addition to addressing some matters in #151, the changes resolve my issue #153, which can be closed once this is merged. Id suggest adding a statement indicating output is not saved when fname is not specified. This is stated in the help, so is not technically needed, but I like having it as part of the output, which is done for get_raw_comps via check_dir in the nwfscSurvey package. If you do, consider adding this in line 178
|
@brianlangseth-NOAA Thank you for taking the time to look over these changes. I can add the requested output message. @okenk Have you had an opportunity to look over these changes? If not (no worries), can you give me a timeline for your review? I want to make sure I get this merged into the main branch, if the changes are acceptable, since I know many teams are likely currently processing their fishery data. |
@chantelwetzel-noaa The newest commit (4d32c86) works for me and now outputs the correct records. |
These changes deal with a couple of issues and other minor changes to improve functionality:
getComps()
n_stewart
.writeComps()
comp_bin
argument to be specified.fname
is NULL.column_with_input_n
arguments and add amatch.arg
call to check. The pro of this change is that it makes it clearer to users the default options for users. The con to this change is that it does not allow for a user to pass a unique input sample size calculation to this function.Given the conversation on #151, I would like both @iantaylor-NOAA and @okenk to carefully consider the change to
getComps()
to make sure this is the approach we would like to modify to. @brianlangseth-NOAA feel free to look this over as well since these changes were based on our discussion as well.