.IS and _interdaily_stability possibly giving different results with resampling #145

Open
hughesan opened this issue Dec 13, 2023 · 0 comments


hughesan commented Dec 13, 2023

Hi - I have a question about the resampling frequency in interdaily stability. The TL;DR version is that .IS(freq="1H") gives me a different result than the _interdaily_stability function from the source code when that function is modified to group by hour only.

The long version of the story is that I have patient data that has a lot of spots we would like to mask - so many that I found it daunting to create a mask file for each. I'm much more comfortable with R than Python so I decided to use R to read in each actigraphy file and mask time according to our determined criteria (remove an entire day if >= 6 h of epochs are NaN). A few patients had no days that needed to be removed, so I decided to see if I could compute IS from scratch in R on one of these patients and match the results I get in pyActigraphy with .IS(binarize=False). Note, these files still have some epochs that are NaN, but my understanding is that missing data is omitted from calculation in .IS and I use var(., na.rm=T) in R to ensure the same.

In R, when the actigraphy data are grouped by hour and minute (one-minute epochs) and the variance of the group means is divided by the overall sample variance, my IS value exactly matches what I get from reading the original file into pyActigraphy and running .IS(binarize=False, freq="1min").
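
For reference, the minute-level computation described above can be sketched in pandas as follows. This is a minimal sketch on synthetic data, not the pyActigraphy implementation: the function name `interdaily_stability_by_minute` and the series `activity` are hypothetical. It relies on pandas skipping NaN in `mean()` and `var()` by default and using the sample variance (ddof=1), which is what `var(., na.rm=T)` does in R.

```python
import numpy as np
import pandas as pd


def interdaily_stability_by_minute(data: pd.Series) -> float:
    """Variance of the minute-of-day means divided by the overall variance."""
    # Average all epochs that share the same hour and minute of the day.
    profile = data.groupby([data.index.hour, data.index.minute]).mean()
    # Both variances skip NaN and use ddof=1 (pandas defaults).
    return profile.var() / data.var()


# Synthetic one-minute epochs over 3 days (illustrative data only).
index = pd.date_range("2023-01-01", periods=3 * 1440, freq="min")
rng = np.random.default_rng(0)
minute_of_day = (index.hour * 60 + index.minute).to_numpy()
values = 100 * (1 + np.sin(2 * np.pi * minute_of_day / 1440))
values = values + rng.normal(0, 20, size=len(index))
activity = pd.Series(values, index=index)

# A short run of missing epochs, which mean() and var() leave out.
activity.iloc[100:130] = np.nan

is_minute = interdaily_stability_by_minute(activity)
print(f"IS (minute grouping): {is_minute:.3f}")
```

Because the synthetic signal repeats almost identically each day, the value comes out close to 1; real recordings will be lower.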

However, varying the resampling frequency gave me some unexpected results. In pyActigraphy I used .IS(binarize=False, freq="1H") to get what I believe is hour-grouped IS. To compute the same thing in R, I grouped the data by hour and divided the variance of the hourly means by the overall sample variance. The two values were quite different: roughly 0.34 (pyActigraphy .IS) vs 0.11 (R). For reference, the hour/minute IS value that matched in both places was also ~0.11. However, if I use the _interdaily_stability function in the source (https://github.com/ghammad/pyActigraphy/blob/master/pyActigraphy/metrics/metrics.py, line 56) with the minute/second grouping omitted, I get the same result as with hour grouping in R (0.11).

    def _interdaily_stability(data):
        d_24h = data.groupby([
            data.index.hour,
            # data.index.minute,
            # data.index.second,
        ]).mean().var()
        d_1h = data.var()
        return d_24h / d_1h

I read the pandas documentation on resampling (https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html). From that, I would expect IS computed from scratch, with the data grouped at the chosen resampling frequency, to match the result of passing that frequency to .IS(), but this is not what I am finding.
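
To make the two computations concrete, here is a minimal sketch on synthetic data that contrasts them. It assumes, without confirmation from the pyActigraphy source, that .IS(freq=...) resamples the series to the requested frequency before both variances are computed. The helper names `is_group_by_hour` and `is_resample_then_group` are hypothetical, and the hourly mean is used as the aggregation (for complete hours a sum would only rescale both variances by the same factor).

```python
import numpy as np
import pandas as pd


def is_group_by_hour(data: pd.Series) -> float:
    """Hour-of-day means of the raw epochs over the variance of the raw epochs."""
    return data.groupby(data.index.hour).mean().var() / data.var()


def is_resample_then_group(data: pd.Series) -> float:
    """Resample to hourly bins first, then compute both variances on the bins."""
    # Assumption: aggregate each hour by its mean before computing IS.
    hourly = data.resample(pd.Timedelta(hours=1)).mean()
    profile = hourly.groupby(
        [hourly.index.hour, hourly.index.minute, hourly.index.second]
    ).mean()
    # The denominator is now the variance of hourly values, not of raw epochs.
    return profile.var() / hourly.var()


# Synthetic one-minute epochs over 7 days with strong minute-to-minute noise.
index = pd.date_range("2023-01-01", periods=7 * 1440, freq="min")
rng = np.random.default_rng(1)
minute_of_day = (index.hour * 60 + index.minute).to_numpy()
values = 100 * (1 + np.sin(2 * np.pi * minute_of_day / 1440))
values = values + rng.normal(0, 200, size=len(index))
activity = pd.Series(values, index=index)

is_hour_grouped = is_group_by_hour(activity)
is_hour_resampled = is_resample_then_group(activity)
print(f"group by hour only:  {is_hour_grouped:.3f}")
print(f"resample, then group: {is_hour_resampled:.3f}")
```

In this sketch the numerators are essentially the same hourly profile, but the denominators differ: resampling averages away the within-hour variability, so the total variance shrinks and the ratio rises. If pyActigraphy does work this way, that alone could produce a gap in the direction reported here (higher IS from .IS(freq="1H") than from hour grouping on the raw epochs).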
