This example predicts future ETH price, using simple input data (just historical ETH price) and a simple model (Facebook's open-source Prophet).
Predictions are 5m, 10m, ..., 60m into the future.
We assume you've already done main5.md "Setup".
In the Python console:
import ccxt
cex_x = ccxt.binance().fetch_ohlcv('ETH/USDT', '5m')
allcex_uts = [xi[0]/1000 for xi in cex_x] # timestamps
allcex_vals = [xi[4] for xi in cex_x] # ETH prices
# Extracts dates and ether price values
print_datetime_info("CEX data info", allcex_uts)
# Transform timestamps to dates
dts = to_datetimes(allcex_uts)
# create a DataFrame with two columns [date, ETH price]; rows are spaced at 5-minute intervals
import pandas as pd
data = pd.DataFrame({"ds": dts, "y": allcex_vals})
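As a rough illustration of the data shape handled above, the sketch below uses two made-up rows in the layout that ccxt returns from `fetch_ohlcv` (`[timestamp_ms, open, high, low, close, volume]`) and converts them the same way. The helper `to_naive_utc` is a hypothetical stand-in; the internals of `to_datetimes` from the setup are not shown here and may differ. Timezone-naive datetimes are used because Prophet rejects a `ds` column that carries a timezone.

```python
from datetime import datetime, timezone

# Synthetic rows in the ccxt OHLCV layout:
# [timestamp_ms, open, high, low, close, volume]. The prices are made up.
sample_rows = [
    [1_700_000_000_000, 2000.0, 2005.0, 1995.0, 2001.5, 120.0],
    [1_700_000_300_000, 2001.5, 2010.0, 2000.0, 2008.0, 95.0],
]

def to_naive_utc(unix_seconds):
    """Hypothetical helper: Unix seconds -> timezone-naive UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).replace(tzinfo=None)

uts = [row[0] / 1000 for row in sample_rows]  # milliseconds -> seconds
closes = [row[4] for row in sample_rows]      # index 4 is the close price
dts = [to_naive_utc(ut) for ut in uts]

print(dts, closes)
```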
In the same Python console:
# use the last 12 periods (5 min each, 60 min total) as the test set; all earlier data is used for training
train_data = data.iloc[0:-12,:]
test_data = data.iloc[-12:,:]
# fit a linear model (Facebook's open-source Prophet: https://facebook.github.io/prophet/)
# As the data is subdaily, the model will fit daily seasonality
from prophet import Prophet
model = Prophet()
model.fit(train_data)
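The train/test split above is chronological rather than random, which matters for time series: the model should never see data from after the period it is asked to predict. A minimal sketch on a synthetic frame (the `toy` names and prices are made up for illustration):

```python
import pandas as pd

# Synthetic stand-in for `data`: 20 rows at 5-minute spacing, made-up prices.
toy = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=20, freq="5min"),
    "y": [2000.0 + i for i in range(20)],
})

n_test = 12  # 12 periods x 5 min = 60 min of held-out data
toy_train = toy.iloc[:-n_test]
toy_test = toy.iloc[-n_test:]

# Every training timestamp precedes every test timestamp.
print(len(toy_train), len(toy_test))
print(toy_train.ds.max() < toy_test.ds.min())
```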
In the same Python console:
# Predict ETH values over the range of the test set
forecast = model.predict(pd.DataFrame({"ds":test_data.ds}))
pred_vals_test = forecast.set_index('ds')['yhat'][-12:].to_numpy()
In the same Python console:
# now, we have predicted and actual values. Let's find error, and plot!
cex_vals = test_data.y
nmse = calc_nmse(cex_vals, pred_vals_test)
print(f"NMSE = {nmse}")
plot_prices(cex_vals, pred_vals_test)
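To give a feel for what a normalized mean-squared error measures, here is a minimal sketch of one common definition: the sum of squared errors divided by the sum of squared actual values. This is an assumption for illustration only; `calc_nmse` comes from the setup and may normalize differently (for example, by first rescaling to the range of the actual values). The name `nmse_sketch` and the numbers are made up.

```python
def nmse_sketch(actual, predicted):
    """Sum of squared errors divided by the sum of squared actual values.

    Assumption: one common normalization; the calc_nmse helper may differ.
    """
    actual, predicted = list(actual), list(predicted)
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    sq_err = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sq_ref = sum(a ** 2 for a in actual)
    return sq_err / sq_ref

toy_actual = [2000.0, 2010.0, 2005.0]  # made-up prices
toy_pred = [2002.0, 2008.0, 2005.0]    # made-up predictions
print(nmse_sketch(toy_actual, toy_pred))
```

Under this definition a perfect prediction scores 0, and the score is unitless, so it can be compared across assets with different price levels.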
In the same Python console:
# Do the imports
import itertools
import numpy as np
import logging
from prophet.diagnostics import (
cross_validation, performance_metrics
)
# Suppress spam debug and info logs from cmdstanpy
logger = logging.getLogger('cmdstanpy')
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.CRITICAL)
If you are wondering what these parameters are, or what cross-validation means, here you can find a brief introduction to what Prophet does under the hood.
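For intuition, the cross-validation in this step is a rolling-origin evaluation: the model is refit at a series of cutoff times and scored on the `horizon` that follows each cutoff. The sketch below is a simplified illustration, not Prophet's own implementation. It assumes a hypothetical training window of 488 five-minute candles and the settings used in this step (`initial` of 1 day, `period` and `horizon` of 1 hour); the function name `rolling_cutoffs` is made up.

```python
from datetime import datetime, timedelta

def rolling_cutoffs(timestamps, initial, period, horizon):
    """Return cutoff times, oldest first, for a rolling-origin evaluation.

    Simplified: walk backward from (last timestamp - horizon) in steps of
    `period`, keeping cutoffs that leave at least `initial` of history.
    """
    first, last = min(timestamps), max(timestamps)
    cutoffs = []
    cutoff = last - horizon
    while cutoff >= first + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return cutoffs[::-1]

# Hypothetical training window: 488 candles at 5-minute spacing.
start = datetime(2024, 1, 1)
stamps = [start + timedelta(minutes=5 * i) for i in range(488)]

folds = rolling_cutoffs(
    stamps,
    initial=timedelta(days=1),
    period=timedelta(hours=1),
    horizon=timedelta(hours=1),
)
print(len(folds), folds[0], folds[-1])
```

A longer `initial` or `period` yields fewer folds and a faster search; a shorter one yields more folds and a steadier error estimate.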
# Set parameters for doing cross-validation
horizon = "1 hours"
initial = "1 days"
period = "1 hours"
# Generate grid for hyperparameters
param_grid = {
"changepoint_prior_scale": [0.001, 0.01, 0.1, 0.5],
"seasonality_prior_scale": [0.01, 0.1, 1.0, 10.0],
"changepoint_range": [0.8, 0.85, 0.95],
}
all_params = [
dict(zip(param_grid.keys(), v)) for v in itertools.product(*param_grid.values())
]
# Empty list that will contain RMSEs for each combination of hyperparams
rmses = []
# Iterate over hyperparameters and save results to compare
for params in all_params:
    m = Prophet(**params).fit(train_data)  # Fit model with given params
    df_cv = cross_validation(m, initial=initial, horizon=horizon, period=period, parallel="processes")
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p['rmse'].values[0])
# Find the best parameters
tuning_results = pd.DataFrame(all_params)
tuning_results['rmse'] = rmses
print(tuning_results) # It will give you an idea of the results
# Extract best set of hyperparameters and print them
best_params = all_params[np.argmin(rmses)]
print(best_params)
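The grid search above can be sketched without Prophet to show how many models it fits and how the winner is picked. The grid is copied from this step; the scores are made-up placeholders rather than real RMSEs, and the names `combos` and `fake_rmses` are for illustration only.

```python
import itertools

grid = {
    "changepoint_prior_scale": [0.001, 0.01, 0.1, 0.5],
    "seasonality_prior_scale": [0.01, 0.1, 1.0, 10.0],
    "changepoint_range": [0.8, 0.85, 0.95],
}

# Cartesian product of all value lists: 4 * 4 * 3 = 48 combinations.
combos = [dict(zip(grid.keys(), values)) for values in itertools.product(*grid.values())]

# Made-up scores standing in for cross-validated RMSEs (NOT real results).
fake_rmses = [
    abs(c["changepoint_prior_scale"] - 0.1) + abs(c["changepoint_range"] - 0.85)
    for c in combos
]

# Index of the smallest score; ties resolve to the first occurrence, like np.argmin.
best_index = min(range(len(fake_rmses)), key=fake_rmses.__getitem__)
print(len(combos), combos[best_index])
```

Each of the 48 combinations triggers a full cross-validation run, so the search time grows with both the grid size and the number of folds.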
Now that we have found the best hyperparameters, we rerun steps 3.1 to 3.3 to compute the NMSE with the new hyperparameters.
model = Prophet(**best_params)
model.fit(train_data)
forecast = model.predict(pd.DataFrame({"ds":test_data.ds}))
pred_vals_test = forecast.set_index('ds')['yhat'][-12:].to_numpy()
cex_vals = test_data.y
nmse = calc_nmse(cex_vals, pred_vals_test)
print(f"NMSE = {nmse}")
In this exercise, the NMSE drops from 0.00066 to 0.00033, a relative improvement of 50%.
In the same Python console:
# fit model with all the available data
model = Prophet(**best_params) # Change to Prophet() if 3.4 skipped
model.fit(data)
# generate dates for prediction (12 time periods ahead of the latest datapoint in the data time)
future_dates = model.make_future_dataframe(periods=12, freq="5min", include_history=False)
# predict ETH values on future_dates
forecast = model.predict(future_dates)
pred_vals = forecast.set_index('ds')['yhat'].to_numpy()
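The future timestamps requested above (12 periods at a 5-minute frequency, history excluded) correspond to the 5m, 10m, ..., 60m horizons. A small sketch of that arithmetic, using a hypothetical `last_seen` timestamp in place of the latest candle in `data`:

```python
from datetime import datetime, timedelta

# Hypothetical timestamp of the most recent candle in `data`.
last_seen = datetime(2024, 1, 1, 12, 0)

# 12 steps of 5 minutes: 5, 10, ..., 60 minutes past the last observation.
future = [last_seen + timedelta(minutes=5 * step) for step in range(1, 13)]
offsets_min = [int((ts - last_seen).total_seconds() // 60) for ts in future]
print(offsets_min)
```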
From Challenge 5, do:
- Publish & share predictions