
use xarray to drop info before converting to df
JessicaS11 committed Dec 4, 2023
1 parent 74c3a60 commit 79ecc27
Showing 1 changed file with 23 additions and 11 deletions.
34 changes: 23 additions & 11 deletions doc/source/example_notebooks/QUEST_argo_data_access.ipynb
@@ -325,7 +325,7 @@
"source": [
"filename = 'processed_ATL03_20220419002753_04111506_006_02.h5'\n",
"\n",
"reader = ipx.Read(source=path+filename)"
"reader = ipx.Read(data_source=path+filename)"
]
},
{
@@ -361,19 +361,31 @@
"user_expressions": []
},
"source": [
"To make the data more easily plottable, let's convert the data into a Pandas DataFrame. Note that this method is memory-intensive for ATL03 data, so users are suggested to look at small spatial domains to prevent the notebook from crashing."
"To make the data more easily plottable, let's convert the data into a Pandas DataFrame. Note that this method is memory-intensive for ATL03 data, so users are suggested to look at small spatial domains to prevent the notebook from crashing. Here, since we only have data from one granule and ground track, we have sped up the conversion to a dataframe by first removing extra xarray dimensions we don't need for our plots. Several of the other steps completed below have analogous operations in xarray that would further reduce memory requirements and computation times."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bc086db7-f5a1-4ba7-ba90-5b19afaf6808",
"metadata": {
"tags": []
},
"id": "50d23a8e",
"metadata": {},
"outputs": [],
"source": [
"is2_pd =(ds.squeeze()\n",
" .reset_coords()\n",
" .drop_vars([\"source_file\",\"data_start_utc\",\"data_end_utc\",\"gran_idx\"])\n",
" .to_dataframe()\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "01bb5a12",
"metadata": {},
"outputs": [],
"source": [
"is2_pd = ds.to_dataframe()"
"is2_pd"
]
},
{
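The markdown note above mentions that the later pandas steps, such as keeping only the ocean photons, have xarray analogues. Below is a minimal, self-contained sketch of that pattern using toy data; the names `ds`, `surf_flag`, `height`, and `photon_idx` are placeholders, not the notebook's actual variables. The idea is to filter with `where(..., drop=True)` before calling `to_dataframe()`, so only the matching rows are ever materialized in pandas.

```python
import numpy as np
import xarray as xr

# Toy dataset standing in for the ATL03 photon data: a surface-type flag and a
# height variable along a shared "photon_idx" dimension (names are placeholders).
ds = xr.Dataset(
    {
        "surf_flag": ("photon_idx", np.array([1, 0, 1, 1, 0])),
        "height": ("photon_idx", np.array([1.2, 3.4, 0.8, 2.1, 5.0])),
    },
    coords={"photon_idx": np.arange(5)},
)

# Filter in xarray first, then convert: only the rows passing the flag test are
# materialized in the resulting pandas DataFrame.
ocean_df = ds.where(ds.surf_flag == 1, drop=True).to_dataframe()
print(ocean_df)
```

On the real dataset the condition would use the appropriate surface-type flag, and the memory savings depend on how many photons the filter excludes.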
@@ -385,9 +397,9 @@
},
"outputs": [],
"source": [
"# Rearrange the data to only include \"ocean\" photons\n",
"is2_pd = is2_pd.reset_index(level=[0,1,2])\n",
"is2_pd_ocean = is2_pd[is2_pd.index==1]\n",
"# Create a new dataframe with only \"ocean\" photons, as indicated by the \"ds_surf_type\" flag\n",
"is2_pd = is2_pd.reset_index(level=[0,1])\n",
"is2_pd_ocean = is2_pd[is2_pd.ds_surf_type==1].drop(columns=\"photon_idx\")\n",
"is2_pd_ocean"
]
},
@@ -446,7 +458,7 @@
"outputs": [],
"source": [
"# Drop time variables that would cause errors in explore() function\n",
"is2_gdf = is2_gdf.drop(['data_start_utc','data_end_utc','delta_time','atlas_sdp_gps_epoch'], axis=1)"
"is2_gdf = is2_gdf.drop(['delta_time','atlas_sdp_gps_epoch'], axis=1)"
]
},
{
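Related to the hunk above: instead of listing the time variables by name, one could drop every datetime-typed column in a single call. This is a small, self-contained sketch with toy data (the column contents are illustrative, not the notebook's values), assuming the time variables carry datetime dtypes; since GeoDataFrames subclass pandas DataFrames, the same two lines would also apply to `is2_gdf`.

```python
import pandas as pd

# Toy frame standing in for is2_gdf: two datetime columns plus a data column.
df = pd.DataFrame(
    {
        "delta_time": pd.to_datetime(["2022-04-19 00:27:53", "2022-04-19 00:27:54"]),
        "atlas_sdp_gps_epoch": pd.to_datetime(["2018-01-01", "2018-01-01"]),
        "height": [1.2, 3.4],
    }
)

# Select every datetime-like column and drop them all in one call.
time_cols = df.select_dtypes(include=["datetime", "datetimetz"]).columns
df = df.drop(columns=time_cols)
print(df)
```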
