Authors: Aditya Shrey and Arnav Chahal
This repository contains code and experiments from our research on combining multiple kernels within a Sparse Gaussian Process (GP) framework for correlated web traffic forecasting. Our approach leverages a variety of kernels—Squared Exponential, Spectral Mixture, Matérn, Linear, and Sinusoidal—to capture complex temporal patterns. By optimizing the Evidence Lower Bound (ELBO), we tune kernel weights and hyperparameters, investigating how each kernel contributes to forecast accuracy and uncertainty estimation.
- `exp_inducing_points.ipynb`: experiment investigating how varying the number of inducing points affects the model's performance and efficiency.
- `exp_kernel_weights.ipynb`: experiment testing kernel-weight optimization on filtered datasets (Soccer, Politics, and Technology) to identify which kernels play a dominant role.
- `exp_step_size.ipynb`: experiment exploring the impact of step size in ELBO maximization on convergence, predictive performance, and uncertainty calibration.
- `kernels.py`: implementations of the kernels:
  - Squared Exponential (SE)
  - Spectral Mixture (SM)
  - Matérn
  - Linear
  - Sinusoidal
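As a reference for the simplest of these, a minimal Squared Exponential kernel can be written as follows. This is a sketch only; the function names and signatures in `kernels.py` may differ.

```python
import numpy as np

def squared_exponential(x1, x2, lengthscale=1.0, variance=1.0):
    """SE (RBF) kernel: k(x, x') = variance * exp(-|x - x'|^2 / (2 * lengthscale^2))."""
    # Pairwise squared distances between two 1-D input arrays
    sq_dists = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

x = np.linspace(0.0, 1.0, 5)
K = squared_exponential(x, x)  # 5 x 5 covariance matrix
```

The resulting matrix is symmetric, has the signal variance on its diagonal, and decays smoothly with distance between inputs.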
- `data.py`: data preprocessing and manipulation methods, including:
  - splitting input-output matrices
  - cleaning and filtering data (e.g., median filtering)
  - optional normalization
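As an illustration of the median-filtering step, here is a minimal sketch; the actual implementation in `data.py` may differ, and the function name and window size here are assumptions.

```python
import numpy as np

def median_filter(series, window=5):
    """Replace each point with the median of a centered window (edges use a shrunken window)."""
    half = window // 2
    out = np.empty_like(series, dtype=float)
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out[i] = np.median(series[lo:hi])
    return out

# A single traffic spike at index 5 is smoothed away by the filter
traffic = np.array([10, 12, 11, 13, 12, 500, 11, 12, 13, 11], dtype=float)
smoothed = median_filter(traffic, window=5)
```

Median filtering is attractive for web traffic because isolated spikes (viral events, bot bursts) are removed without blurring the surrounding level, unlike a moving average.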
- `plot.py`: visualization utilities to generate plots of time series, forecasts, ELBO curves, and kernel weight distributions.
- `test_kernels.ipynb`: preliminary tests ensuring correct kernel implementations.
- `test_simple2D.ipynb`: a toy experiment using a simple 2D dataset to verify the pipeline before applying it to more complex web traffic data.
- `sparse_gp.py`: implementation of the Sparse Gaussian Process and ELBO-based variational inference, including:
  - variational optimization of inducing points and kernel hyperparameters
  - computation of the ELBO, predictive distributions, and other components of the GP framework
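For reference, a common form of the sparse-GP objective is the collapsed variational bound of Titsias (2009); the exact ELBO implemented in `sparse_gp.py` may differ, but it typically takes the form

$$
\mathcal{L} = \log \mathcal{N}\left(\mathbf{y} \mid \mathbf{0},\; \mathbf{Q}_{nn} + \sigma^2 \mathbf{I}\right) - \frac{1}{2\sigma^2}\operatorname{tr}\left(\mathbf{K}_{nn} - \mathbf{Q}_{nn}\right), \qquad \mathbf{Q}_{nn} = \mathbf{K}_{nm}\mathbf{K}_{mm}^{-1}\mathbf{K}_{mn},
$$

where $\mathbf{K}_{mm}$ is the covariance among the $M$ inducing points and $\mathbf{K}_{nm}$ is the cross-covariance between the $N$ data points and the inducing points. The trace term penalizes information lost by the sparse approximation, which is what drives the inducing-point locations during optimization.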
We combine multiple kernels to capture different aspects of time series behavior. Some kernels model smooth variations, others handle periodicity or complex frequency structures. Below are covariance heatmaps for two illustrative kernels:
Sinusoidal Kernel Covariance Heatmap:
Captures periodic patterns in the data.

Spectral Mixture Kernel Covariance Heatmap:
Handles multi-periodic or complex frequency patterns through a Gaussian mixture in the spectral domain.
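To make the kernel-combination idea concrete: a non-negative weighted sum of valid kernels is itself a valid kernel. The sketch below combines an SE kernel with a sinusoidal (periodic) kernel; the function names and fixed weights are illustrative, not the repository's API.

```python
import numpy as np

def se_kernel(x1, x2, lengthscale=1.0):
    """Smooth, slowly varying structure."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / lengthscale ** 2)

def sinusoidal_kernel(x1, x2, period=1.0, lengthscale=1.0):
    """Standard periodic kernel: exp(-2 sin^2(pi |x - x'| / period) / lengthscale^2)."""
    d = np.abs(x1[:, None] - x2[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale ** 2)

def combined_kernel(x1, x2, weights=(0.5, 0.5)):
    """A non-negative weighted sum of PSD kernels is itself a valid PSD kernel."""
    return weights[0] * se_kernel(x1, x2) + weights[1] * sinusoidal_kernel(x1, x2)

x = np.linspace(0.0, 4.0, 50)
K = combined_kernel(x, x)  # covariance mixing smooth and periodic structure
```

In the actual model the weights are learned jointly with the hyperparameters by maximizing the ELBO rather than fixed by hand.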

Traditional GPs scale poorly to large datasets, with training cost on the order of O(N^3) for N data points. Sparse GPs address this by introducing a set of M inducing points (with M << N), reducing the cost to approximately O(M^2 N).
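The complexity reduction can be seen directly in the linear algebra: the Nyström-style approximation underlying sparse GPs replaces operations on the full N x N covariance with factors built from an M x M matrix. A minimal sketch with illustrative names (not the repository's code):

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared Exponential covariance between two 1-D input arrays."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

N, M = 500, 20                        # many data points, few inducing points
x = np.linspace(0.0, 10.0, N)         # training inputs
z = np.linspace(0.0, 10.0, M)         # inducing-point locations

K_nm = rbf(x, z)                      # N x M cross-covariance
K_mm = rbf(z, z) + 1e-6 * np.eye(M)   # M x M, with jitter for numerical stability
# Q_nn = K_nm K_mm^{-1} K_mn approximates the full N x N covariance using
# O(M^2 N) work, instead of the O(N^3) needed to factor K_nn directly.
Q_nn = K_nm @ np.linalg.solve(K_mm, K_nm.T)
```

Because Q_nn is a low-rank surrogate for the true covariance, its diagonal never exceeds the true prior variance; the gap is exactly the trace term that the ELBO penalizes.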
Inducing Points Visualization:
This figure conceptually shows how inducing points represent a compressed summary of the data, balancing complexity and scalability.

We apply our methods to subsets of the Wikipedia Traffic Data Exploration dataset (2015-2017). Our focus is on correlated web traffic time series—e.g., English Premier League soccer clubs, political figures, and major technology companies.
To handle outliers or extreme behavior, we experimented with median filtering. While it improved performance in some datasets, certain datasets like the unfiltered soccer data still yielded strong forecasts, potentially because there were no extreme anomalies to remove.
- Toy Example (Simple 2D Data): before tackling real-world complexity, we validated our approach on a simple toy dataset, ensuring that our code and methods function correctly in a controlled scenario.

- Soccer Dataset (Unfiltered): on this raw dataset, without median filtering, our model could still identify underlying patterns, suggesting that when the data aren't plagued by severe outliers, filtering may not be necessary.

In contrast, for datasets like Politics or Technology (not shown here), median filtering helped stabilize the model due to their more erratic search-volume patterns. The experiments showed that:
- Some kernels never fully dropped to zero weight, indicating that even less dominant kernels still offered incremental improvements.
- Step size tuning affected how confidently (and how accurately) the model predicted.
- Varying the number of inducing points had less impact than anticipated, suggesting broad robustness in how the sparse GP leveraged them.
- Data Preparation: use `data.py` to preprocess your dataset into the required format.
- Running Experiments:
  - `exp_step_size.ipynb`: test different step sizes.
  - `exp_inducing_points.ipynb`: vary the number of inducing points.
  - `exp_kernel_weights.ipynb`: optimize kernel weights on filtered datasets.

  For initial checks:
  - `test_kernels.ipynb`: verify kernel implementations.
  - `test_simple2D.ipynb`: run a toy scenario.
- Visualization: use `plot.py` to generate plots; `data_imgs/` holds reference images and saved figures.