Did a package ever come out of this hackathon? #13
This never made it to a standalone package, partly because, if you look through the notebooks (e.g. https://github.com/oceanhackweek/ohw21-proj-model-subsampling/blob/main/Application2LLC.ipynb), you will see that xarray's interp function does almost everything we need (cell number 128). Do you think this still requires a package of its own? If so, will the package just work for MITgcm? |
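For readers who have not opened the notebook, here is a minimal sketch of the kind of pointwise sampling that xarray's interp can do. It is not the notebook's code: the synthetic field, the coordinate values, and the three casts are made up for illustration, and the names `XC`, `YC`, `time`, and `casts` are only borrowed from later in this thread. Note that `interp` relies on scipy being installed.

```python
import numpy as np
import xarray as xr

# Synthetic gridded field standing in for model output (an assumption):
# THETA = XC + 10*YC + 100*time, so linear interpolation is exact.
XC = np.arange(0.0, 5.0)
YC = np.arange(0.0, 4.0)
time = np.arange(0.0, 3.0)
theta = (
    xr.DataArray(XC, dims="XC", coords={"XC": XC})
    + 10 * xr.DataArray(YC, dims="YC", coords={"YC": YC})
    + 100 * xr.DataArray(time, dims="time", coords={"time": time})
).rename("THETA")

# Three hypothetical casts. Every indexer shares the new "casts" dimension,
# so interp returns one value per cast instead of a 3x3x3 cube.
casts = {
    "XC": xr.DataArray([0.5, 1.25, 3.0], dims="casts"),
    "YC": xr.DataArray([1.0, 2.5, 0.75], dims="casts"),
    "time": xr.DataArray([0.0, 1.5, 2.0], dims="casts"),
}
sampled = theta.interp(casts)  # linear interpolation by default
print(sampled.dims)
```

The key design point is the shared dimension: passing plain arrays or indexers with different dimensions would instead interpolate onto the full outer product of the requested coordinates.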
Quick answer: I have been developing it here: https://github.com/kdrushka/oceanliner (most up-to-date and functional code is in the testing folder ...) |
This looks nice, I had not realized that you were working on this. |
Thanks! Indeed, the xarray interp function does all the heavy lifting. I have mainly been working on the user inputs and outputs. It is written for model files that have been converted to netCDF (while retaining the original variable names) but could easily be adapted to other formats (e.g., binary files). |
Thanks @dhruvbalwada and @kdrushka! What I had in mind was something in between both of your suggestions, i.e. a very light-weight wrapper for

    survey = xr.Dataset(
        dict(
            lon = xr.DataArray(lon, dims='casts'),
            lat = xr.DataArray(lat, dims='casts'),
            time = xr.DataArray(time, dims='casts'),
        )
    )
    ds['THETA'].interp(survey)

that would robustly handle various use cases and provide some basic utilities. Even if this code snippet isn't complicated, it's 5 lines longer than it should be and will have to be modified for every use case! I imagine something flexible enough to handle various permutations of common arguments like:

    from xsample import sample
    sample(ds['THETA'], ds_survey['THETA'], inner=['XC', 'YC', 'time'])  # for xr.Dataset with dimension
    sample(ds[['THETA', 'SALT']], df_survey, inner=['XC', 'YC', 'Z', 'time'], outer='point')  # for pd.DataFrame
    sample(ds['UVEL'], {'XG': XG, 'YC': YC}, outer='mooring')  # for np.array

where |
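To make the proposal a bit more concrete, here is one possible sketch of such a wrapper. Everything in it is an assumption: no `xsample` package is confirmed to exist in this thread, and the `sample` signature below only mimics the calls proposed above (`inner` names the coordinates to interpolate over, `outer` names the new sampling dimension). It simply normalises a DataFrame, Dataset, or dict of arrays into xarray indexers and hands them to `interp`, which needs scipy.

```python
import numpy as np
import pandas as pd
import xarray as xr


def sample(field, survey, inner=None, outer="point"):
    """Hypothetical wrapper: interpolate `field` at the locations in `survey`.

    `survey` may be a pd.DataFrame, an xr.Dataset of 1-D variables,
    or a dict of array-likes keyed by coordinate name.
    """
    if isinstance(survey, pd.DataFrame):
        columns = {name: survey[name].to_numpy() for name in survey.columns}
    elif isinstance(survey, xr.Dataset):
        columns = {name: survey[name].values for name in survey.variables}
    else:
        columns = {name: np.asarray(vals) for name, vals in survey.items()}

    # Default: interpolate over every survey column that is a dimension of the field.
    if inner is None:
        inner = [name for name in columns if name in field.dims]

    # All indexers share the `outer` dimension, which gives pointwise sampling.
    indexers = {
        name: xr.DataArray(np.atleast_1d(columns[name]), dims=outer)
        for name in inner
    }
    return field.interp(indexers)


# Usage on a synthetic field (values are made up): THETA = XC + 10*YC + 100*time
XC, YC, time = np.arange(5.0), np.arange(4.0), np.arange(3.0)
field = (
    xr.DataArray(XC, dims="XC", coords={"XC": XC})
    + 10 * xr.DataArray(YC, dims="YC", coords={"YC": YC})
    + 100 * xr.DataArray(time, dims="time", coords={"time": time})
)
df_survey = pd.DataFrame({"XC": [0.5, 1.25], "YC": [1.0, 2.5], "time": [0.0, 1.5]})
points = sample(field, df_survey)  # one value per row
mooring = sample(field, {"XC": [2.5], "YC": [1.5]}, outer="mooring")  # keeps time
```

Dimensions that are not listed in `inner` (here `time` for the mooring) are left untouched, so a mooring comes back as a full time series at a fixed location.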
Anecdotally, I am even having trouble getting this to |
I personally did not dig into the optimal behavior much when we were working on this. I do remember wondering about it, especially for datasets from the LLC4320 run, but I don't remember how they were chunked. Maybe @kdrushka knows more, since she has been working more with the cutouts for SWOT cross-over? Do you have a notebook with what you are trying to do? If you post a notebook as a gist that shows the chunking etc. for your particular example, I can take a look if I get some time, or ask people around me who are much better at task graphs etc. |
I like your suggestion and believe that this could be a nice solution for that. @kdrushka and @dhruvbalwada |
@hdrake, I like your suggestion - this is much simpler than what I'm doing. I am also struggling with memory - in part because I am still new to Python/dask, so I am mostly guessing at how to optimize the code. That said, it runs (albeit slowly) on the LLC4320 cutouts I'm working with. |
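On the memory side, one thing that may help (this is a general suggestion, not something anyone in this thread has tested on the LLC4320 cutouts) is to crop the model field to a box around the survey before interpolating, so that `interp` only touches the data it needs. The helper name `crop_to_survey` and the `pad` parameter are hypothetical, and the sketch assumes numeric, monotonically increasing coordinates with `pad` at least one grid spacing wide.

```python
import numpy as np
import xarray as xr


def crop_to_survey(field, indexers, pad=1.0):
    """Hypothetical helper: keep only the part of `field` that brackets the survey.

    Assumes each coordinate is numeric and increasing, and that `pad`
    (in coordinate units) is at least one grid spacing.
    """
    for name, points in indexers.items():
        values = np.asarray(points)
        lo = float(values.min()) - pad
        hi = float(values.max()) + pad
        field = field.sel({name: slice(lo, hi)})
    return field


# Synthetic 50 x 40 field (made-up values): field = XC + 10*YC
XC, YC = np.arange(50.0), np.arange(40.0)
field = (
    xr.DataArray(XC, dims="XC", coords={"XC": XC})
    + 10 * xr.DataArray(YC, dims="YC", coords={"YC": YC})
)
casts = {
    "XC": xr.DataArray([10.2, 11.7], dims="casts"),
    "YC": xr.DataArray([5.5, 6.1], dims="casts"),
}

cropped = crop_to_survey(field, casts)  # 3 x 3 box instead of 50 x 40
sampled = cropped.interp(casts)
```

Because label-based slicing is lazy on dask-backed arrays, the crop itself should be cheap; whether it fixes a particular memory problem will still depend on how the data are chunked.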
I agree it could be worth putting on pip / conda-forge if it works out, but in that case it should be extremely lightweight: as few dependencies as possible, no notebooks or figures in the repo to keep it small, etc. I'm a bit swamped with OSM this next week but will return to this in March. |
Sure! Let me know if/how I can help. |
Hi! I am writing some code to do something similar, namely subsample a high-resolution MITgcm simulation, and don't want to reinvent the wheel. Two questions: