GitHub - dream-faster/fold: 🪁 A fast Adaptive Machine Learning library for Time-Series, that lets you build, deploy and update composite models easily. An order of magnitude speed-up, combined with flexibility and rigour. This is an internal project: the documentation is no longer updated and differs substantially from the current API.

FOLD

Fast Adaptive Time Series ML Engine
This is an internal project: the documentation is no longer updated and differs substantially from the current API. Explore the docs »

The Adaptive ML Engine that lets you build, deploy and update Models easily. An order of magnitude speed-up, combined with flexibility and rigour.

Main Features

10x faster Adaptive Backtesting - What does that mean?
Composite Models made Adaptive - What does that mean?
Distributed computing - Why is this important?
Update deployed models (coming in May) - Why is this important?

Installation

Prerequisites: python >= 3.8 and pip
Install from pypi:
```
pip install fold-core
```

Quickstart

You can quickly train your chosen models and get predictions by running:

```
from sklearn.ensemble import RandomForestRegressor
from statsforecast.models import ARIMA
from fold import ExpandingWindowSplitter, train_evaluate
from fold.composites import Ensemble
from fold.transformations import OnlyPredictions
from fold.utils.dataset import get_preprocessed_dataset

# Load a preprocessed sample dataset with temperature as the target.
X, y = get_preprocessed_dataset(
    "weather/historical_hourly_la", target_col="temperature", shorten=1000
)

# Combine a tabular model and a univariate model in a single ensemble.
pipeline = [
    Ensemble(
        [
            RandomForestRegressor(),
            ARIMA(order=(1, 1, 0)),
        ]
    ),
    OnlyPredictions(),
]
# Backtest with an expanding training window.
splitter = ExpandingWindowSplitter(initial_train_window=0.2, step=0.2)
# Train and evaluate the pipeline on every split.
scorecard, prediction, trained_pipelines, _, _ = train_evaluate(pipeline, X, y, splitter)
```

(If you install krisi by running pip install krisi you get an extended report back, rather than a single metric.)
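To make the splitter arguments in the Quickstart more concrete, here is a minimal sketch of how an expanding window could be cut into train and test ranges. It is not fold's implementation: the helper `expanding_window_splits` is hypothetical, and it assumes that `initial_train_window` and `step` are fractions of the dataset length.

```python
from typing import List, Tuple


def expanding_window_splits(
    n_samples: int, initial_train_window: float, step: float
) -> List[Tuple[range, range]]:
    """Return (train_indices, test_indices) pairs for an expanding window.

    Hypothetical helper for illustration only, not part of fold.
    Both fractions are interpreted relative to n_samples (an assumption).
    """
    first_test_start = int(n_samples * initial_train_window)
    step_size = max(1, int(n_samples * step))
    splits = []
    for test_start in range(first_test_start, n_samples, step_size):
        test_end = min(test_start + step_size, n_samples)
        # The training range always starts at 0 and grows with every fold.
        splits.append((range(0, test_start), range(test_start, test_end)))
    return splits


splits = expanding_window_splits(1000, initial_train_window=0.2, step=0.2)
for train, test in splits:
    print(f"train [0, {train.stop})  test [{test.start}, {test.stop})")
```

Under these assumptions, 1000 rows with both values set to 0.2 give four folds, each testing on the 200 rows that follow an ever-growing training range.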

Fold is different

Adaptive Models and Backtesting at lightning speed.
→ fold lets you simulate and evaluate your models as they would have performed in reality (when deployed), with clever use of parallelization and design.
Create composite models: ensembles, hybrids, stacking pipelines, easily.
→ Composite models are underutilized, yet they are the easiest and fastest way to improve the performance of your Time Series models.
Built with Distributed Computing in mind.
→ Deploy your research and development pipelines to a cluster with ray, and use modin to handle out-of-memory datasets (full support for modin is coming in April).
Bridging the gap between Online and Mini-Batch learning.
→ Mix and match xgboost with ARIMA in a single pipeline. Boost your models' accuracy by updating them on every timestamp, if desired.
Update your deployed models, easily, as new data flows in.
→ The real world is not static. Let your models adapt, without the need to re-train from scratch.
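As an illustration of the composite models mentioned above, the sketch below combines two toy forecasters by averaging their predictions. Averaging is one common way to build an ensemble; this README does not say how fold's `Ensemble` combines its members, so treat the combination rule and all names here (`last_value`, `mean_value`, `make_ensemble`) as assumptions made for the example.

```python
from statistics import fmean
from typing import Callable, List, Sequence

# A forecaster maps the observed history to a one-step-ahead prediction.
Forecaster = Callable[[Sequence[float]], float]


def last_value(history: Sequence[float]) -> float:
    """Naive forecaster: repeat the most recent observation."""
    return history[-1]


def mean_value(history: Sequence[float]) -> float:
    """Naive forecaster: predict the mean of the history."""
    return fmean(history)


def make_ensemble(models: List[Forecaster]) -> Forecaster:
    """Combine forecasters by averaging their predictions (assumed rule)."""

    def predict(history: Sequence[float]) -> float:
        return fmean([model(history) for model in models])

    return predict


combined = make_ensemble([last_value, mean_value])
prediction = combined([1.0, 2.0, 3.0])
print(prediction)
```

The same pattern extends to hybrids and stacking by replacing the averaging step with a different combination rule.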

Examples, Walkthroughs and Blog Posts

| Name | Type | Dataset Type | Docs Link | Colab |
| --- | --- | --- | --- | --- |
| ⚡️ Core Walkthrough | Walkthrough | Energy | Notebook | Colab |
| 🚄 Speed Comparison of Fold to other libraries | Walkthrough | Weather | Notebook | Colab |
| 📚 Example Collection | Example | Weather & Synthetic | Collection Link | - |
| 🖋️ Why we ended up building an Adaptive ML engine for Time Series | Blog | Public Release Blog Post | Blog post on Applied Exploration | - |

Core Features

Supports both Regression and Classification tasks.
Online and Mini-batch learning.
Feature selection and other transformations on an expanding/rolling window basis.
Use any scikit-learn/tabular model natively!
Use any univariate or sequence models (wrappers provided in fold-wrappers).
Use any Deep Learning Time Series models (wrappers provided in fold-wrappers).
Super easy syntax!
Probabilistic forecasts (currently for Classification; full support coming in April).
Hyperparameter optimization / Model selection (coming in early April!).

What is Adaptive Backtesting?

It's like classical Backtesting / Time Series Cross-Validation, with one addition: inside a test window, and during deployment, fold provides a way for models to update their parameters or access the last value. Learn more
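The difference can be sketched with a toy online model. This is a conceptual illustration under stated assumptions, not fold's API: `RunningMean` and `backtest` are hypothetical names, and the "model" simply predicts the mean of everything it has seen. In the classical run the model is frozen after training; in the adaptive run it is updated with each actual value, but only after it has made its prediction for that timestamp.

```python
from typing import List, Sequence


class RunningMean:
    """Toy online model: predicts the mean of all values seen so far."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, value: float) -> None:
        self.total += value
        self.count += 1

    def predict(self) -> float:
        return self.total / self.count


def backtest(
    train: Sequence[float], test: Sequence[float], adaptive: bool
) -> List[float]:
    """Predict every test point; optionally update inside the test window."""
    model = RunningMean()
    for value in train:
        model.update(value)
    predictions = []
    for actual in test:
        # Predict first, so the model never sees the value it is scored on.
        predictions.append(model.predict())
        if adaptive:
            # Adaptive step: learn from the actual once it is revealed.
            model.update(actual)
    return predictions


train, test = [1.0, 1.0], [4.0, 4.0]
static_predictions = backtest(train, test, adaptive=False)
adaptive_predictions = backtest(train, test, adaptive=True)
print(static_predictions, adaptive_predictions)
```

In this toy run the static predictions stay at the training mean, while the adaptive predictions move towards the new level after the first test observation.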

Our Open-core Time Series Toolkit

Explore our Commercial License options here

Contribution

Join us for live discussion!

Submit an issue or reach out to us on info at dream-faster.ai for any inquiries.

Licence & Usage

We want to bring much-needed transparency, speed and rigour to the process of creating Time Series ML pipelines, while also building a sustainable business that can support the ecosystem in the long term. Fold's licence sits in between source-available and a traditional commercial software licence. It requires a paid licence for any commercial use after the initial 30-day trial period.

We also want to contribute to open research by giving free access to non-commercial, research use of fold.

Limitations

No intermittent time series support, very limited support for missing values.
No hierarchical time series support.

