PullSpp.fn() takes too long #30
Comments
@Curt-Whitmire-NOAA do you know if there is a way to access a list of species names, with their common and scientific names, using SQL or a URL? Currently, I download all data and find the unique values, which is very costly in time. I am looking for a simple way to filter upon download rather than after download, or a URL for an existing table. Thanks.
@kellijohnson-NOAA, I can certainly provide a table that could be uploaded to GitHub. Do you only want unique fish names, or do you also want invertebrates? This would be a short-term fix and wouldn't update dynamically. For a dynamic solution, I can work with our developer to add the taxonomy table to the DW front end. I will likely have some follow-up questions to make this table as useful as possible.
I think that the current function will work until we can get a dynamic solution going. In short, I am looking for a way to link common names to scientific names, and the reverse, that accounts for historical names and species complexes.
@kellijohnson-NOAA, I will email you a CSV with the current list of "fish" in the Data Warehouse taxonomy dimensions table. Please review and let me know if it suits your needs for now. We can then discuss a better dynamic solution.
Thanks to @Curt-Whitmire-NOAA for providing the SQL call. This is much, much faster than running unique() on a full data pull. The change also includes code to make GetSpp.fn more robust when multiple names are returned.
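For readers following the thread, the slow approach discussed above (pull every record, then reduce to unique names) can be sketched roughly as below. This is an illustrative Python sketch only, not the package's R code: the record field names `common_name` and `scientific_name`, the helper names, and the sample rows are all assumptions made for the example. The cost comes from having to download every catch record before the deduplication step can run.

```python
def build_species_lookup(records):
    """Reduce full catch records to sorted unique (common, scientific) pairs.

    `records` is a list of dicts; the keys used here are hypothetical.
    Rows missing either name are skipped.
    """
    pairs = {
        (r["common_name"].strip().lower(), r["scientific_name"].strip())
        for r in records
        if r.get("common_name") and r.get("scientific_name")
    }
    return sorted(pairs)


def find_species(lookup, name):
    """Return every pair whose common OR scientific name matches `name`.

    A list is returned because one name may map to several entries,
    for example a species complex.
    """
    key = name.strip().lower()
    return [pair for pair in lookup if key in (pair[0], pair[1].lower())]


# Made-up sample rows standing in for a full data pull.
records = [
    {"common_name": "Sablefish", "scientific_name": "Anoplopoma fimbria"},
    {"common_name": "sablefish ", "scientific_name": "Anoplopoma fimbria"},
    {"common_name": "Pacific hake", "scientific_name": "Merluccius productus"},
    {"common_name": None, "scientific_name": "Sebastes sp."},
]

lookup = build_species_lookup(records)
by_common = find_species(lookup, "sablefish")
by_scientific = find_species(lookup, "Anoplopoma fimbria")
```

Querying a dedicated taxonomy table on the server side avoids the full download entirely, which is why the SQL call is so much faster.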
@Curt-Whitmire-NOAA any progress on making the taxonomy dimensions table available as a pull, rather than how we have it now with a saved CSV that is NEVER updated?
@kellijohnson-NOAA I found that the taxonomy dimension table is already exposed via the API. There are some fields that still need to be fully populated (e.g., ITIS Serial #, WoRMS AphiaID), but as far as I can tell the table includes the full list of taxonomic names for all our FRAM programs.
@kellijohnson-NOAA here's an example API pull for all "fish" species:
@kellijohnson-NOAA Potential filters (arguments) to consider for including in the PullSpp.fn() are:
If there's anything else you need for this enhancement, please let me know.
Thank you @Curt-Whitmire-NOAA the link you gave me worked flawlessly using
@kellijohnson-NOAA glad it meets your needs! Now we should both stop working for the night ;^)
PullSpp.fn() extracts data for all species from a single year of the WCGBTS, which takes forever. This function supplies a lookup data frame of common name to scientific name for species caught within the surveys performed or used by the NWFSC. See its development in pull request #28. Via email request, the survey team is trying to make the original lookup table within the warehouse accessible to all users, i.e., downloadable from a web link. This issue is to remind us that the code in URLtext on line 19 will need to be changed for efficiency.