Improve handling of datetime objects #84
Comments
Various ways of creating arrays of
The function Note that these functions are not strictly equivalent, in that they return slightly different object types:
NB: A word of caution is needed, however. Calculating the difference between two values in the arrays above will yield quite different results. The difference between two...
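To illustrate the caution above, here is a minimal sketch. The sample time stamps and variable names are made up for illustration and are not taken from the project. It shows that the same subtraction returns a different object type depending on the container, so converting the result to float seconds needs a different call in each case:

```python
from datetime import timedelta

import numpy as np
import pandas as pd

# Hypothetical sample: three time stamps one second apart.
index = pd.to_datetime(
    ["2020-01-01 00:00:00", "2020-01-01 00:00:01", "2020-01-01 00:00:02"]
)

diff_index = index[1] - index[0]     # pd.Timedelta
values = index.values                # numpy array with a datetime64 dtype
diff_numpy = values[1] - values[0]   # np.timedelta64
pydt = index.to_pydatetime()         # numpy object array of datetime.datetime
diff_python = pydt[1] - pydt[0]      # datetime.timedelta

print(type(diff_index).__name__, type(diff_numpy).__name__, type(diff_python).__name__)

# Each result type has its own way of becoming float seconds.
seconds = (
    diff_index.total_seconds(),
    diff_numpy / np.timedelta64(1, "s"),
    diff_python.total_seconds(),
)
print(seconds)
```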
As of now, my recommendation is that we implement either of the pandas-based functions defined below:

    from datetime import datetime, timedelta

    import numpy as np
    import pandas as pd


    def pd_to_datetime(timearray, dtg_ref=None):
        """
        pandas.DatetimeIndex -- utilizing pandas' strengths for efficiency.

        To convert the DatetimeIndex to a numpy array (of np.datetime64 values):
        >>> dtindex = pd_to_datetime(timearray, dtg_ref)
        >>> arr = dtindex.values
        """
        if dtg_ref is None:
            dtg_ref = pd.Timestamp.now()
        elif isinstance(dtg_ref, (datetime, np.datetime64)):
            dtg_ref = pd.Timestamp(dtg_ref)

        # generate datetime index (strictly: Timestamp index) in two steps:
        # 1) generate datetime index by passing list of floats, without specifying a reference (start) time
        # 2) shift the datetime index to start at the right time
        # NB: default unit for pd.to_datetime() is nanoseconds (ns), and the command below is _much_ faster than
        #     specifying the unit explicitly (e.g. pd.to_datetime(timearray, unit='s'))
        dtg_time = pd.to_datetime(np.asarray(timearray) * 1e9)
        dtg_time += dtg_ref - dtg_time[0]
        return dtg_time


    def pd_datetime_array(timearray, dtg_ref=None):
        """
        numpy array of datetime.datetime objects -- generated by use of pandas' efficiency.
        """
        dtg_time = pd_to_datetime(timearray, dtg_ref=dtg_ref)
        return dtg_time.to_pydatetime()

PS. It's easy (and quite efficient) to go back to a DatetimeIndex:

    arr = pd_datetime_array(timearray, dtg_ref)
    index = pd.to_datetime(arr)
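The round trip mentioned in the PS can be checked with a small self-contained sketch. The reference time and offsets are hypothetical, and the array is built by hand here rather than with the functions above:

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

start = datetime(2020, 1, 1)  # hypothetical reference time
# Object array of datetime.datetime, as to_pydatetime() would return.
arr = np.array([start + timedelta(seconds=s) for s in (0.0, 0.5, 2.0)])

index = pd.to_datetime(arr)   # back to a DatetimeIndex
back = index.to_pydatetime()  # and once more to datetime.datetime objects

print(index)
print(list(back) == list(arr))
```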
Is your feature request related to a problem? Please describe.
The handling of datetime objects is extremely slow, especially with a large number of objects.
Describe the solution you'd like
Converting a large array or list of datetime objects to floats (e.g. seconds) should be much faster.
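As a rough sketch of what such a conversion could look like with pandas: the helper name `datetimes_to_seconds` and its `ref` parameter are hypothetical, not existing functions in this project, and no timing claims are made here.

```python
from datetime import datetime, timedelta

import pandas as pd


def datetimes_to_seconds(datetimes, ref=None):
    """Hypothetical helper: float seconds elapsed since `ref` (default: first element)."""
    index = pd.DatetimeIndex(datetimes)
    if ref is None:
        ref = index[0]
    # Subtracting a Timestamp yields a TimedeltaIndex; total_seconds() is vectorized.
    return (index - pd.Timestamp(ref)).total_seconds().to_numpy()


start = datetime(2020, 1, 1)  # made-up sample data
stamps = [start + timedelta(seconds=0.25 * i) for i in range(5)]
seconds = datetimes_to_seconds(stamps)
print(seconds)
```

The subtraction and the conversion to seconds both operate on the whole index at once, instead of looping over individual datetime objects in Python.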
Describe alternatives you've considered
Use pandas.DatetimeIndex instead of datetime.datetime.