Merged
33 changes: 33 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,33 @@
name: Push to PyPi

on:
  push:
    branches:
      - master

jobs:
  test:
    runs-on: "ubuntu-latest"

    steps:
      - name: Checkout source
        uses: actions/checkout@v2

      - name: Set up Python 3.8
        uses: actions/setup-python@v1
        with:
          python-version: 3.8

      - name: Install build dependencies
        run: python -m pip install build wheel

      - name: Build distributions
        shell: bash -l {0}
        run: python setup.py sdist bdist_wheel

      - name: Publish package to PyPI
        if: github.repository == 'automl/Auto-PyTorch' && github.event_name == 'push' && startsWith(github.ref, 'refs/tags')
        uses: pypa/gh-action-pypi-publish@master
        with:
          user: __token__
          password: ${{ secrets.pypi_token }}
19 changes: 19 additions & 0 deletions CITATION.cff
@@ -0,0 +1,19 @@
preferred-citation:
  type: article
  authors:
  - family-names: "Zimmer"
    given-names: "Lucas"
    affiliation: "University of Freiburg, Germany"
  - family-names: "Lindauer"
    given-names: "Marius"
    affiliation: "University of Freiburg, Germany"
  - family-names: "Hutter"
    given-names: "Frank"
    affiliation: "University of Freiburg, Germany"
  doi: "10.1109/TPAMI.2021.3067763"
  journal-title: "IEEE Transactions on Pattern Analysis and Machine Intelligence"
  title: "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"
  year: 2021
  note: "also available under https://arxiv.org/abs/2006.13799"
  start: 3079
  end: 3090
60 changes: 52 additions & 8 deletions README.md
@@ -1,14 +1,14 @@
# Auto-PyTorch

Copyright (C) 2019 [AutoML Group Freiburg](http://www.automl.org/)
Copyright (C) 2021 [AutoML Groups Freiburg and Hannover](http://www.automl.org/)

This an alpha version of Auto-PyTorch with improved API.
So far, Auto-PyTorch supports tabular data (classification, regression).
We plan to enable image data and time-series data.
While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed **Auto-PyTorch**, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).

Auto-PyTorch is mainly developed to support tabular data (classification, regression).
The newest features in Auto-PyTorch for tabular data are described in the paper ["Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"](https://arxiv.org/abs/2006.13799) (see below for bibtex ref).

Find the documentation [here](https://automl.github.io/Auto-PyTorch/development)

***From v0.1.0, AutoPyTorch has been updated to further improve usability, robustness and efficiency by using SMAC as the underlying optimization package as well as changing the code structure. Therefore, moving from v0.0.2 to v0.1.0 will break compatibility.
In case you would like to use the old API, you can find it at [`master_old`](https://github.com/automl/Auto-PyTorch/tree/master-old).***

## Installation

@@ -33,6 +33,50 @@ python setup.py install

```

## Examples

In a nutshell:

```py
from autoPyTorch.api.tabular_classification import TabularClassificationTask

# data and metric imports
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

# initialise Auto-PyTorch api
api = TabularClassificationTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    optimize_metric='accuracy',
    total_walltime_limit=300,
    func_eval_time_limit_secs=50
)

# Calculate test accuracy
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print("Accuracy score", score)
```

For more examples, including customising the search space and parallelising the code, check out the `examples` folder:

```sh
$ cd examples/
```


Code for the [paper](https://arxiv.org/abs/2006.13799) is available under `examples/ensemble` in the [TPAMI.2021.3067763](https://github.com/automl/Auto-PyTorch/tree/TPAMI.2021.3067763) branch.

## Contributing

If you want to contribute to Auto-PyTorch, clone the repository and check out our current development branch
@@ -63,8 +107,8 @@ Please refer to the branch `TPAMI.2021.3067763` to reproduce the paper *Auto-PyT
title = {Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2021},
note = {IEEE early access; also available under https://arxiv.org/abs/2006.13799},
pages = {1-12}
note = {also available under https://arxiv.org/abs/2006.13799},
pages = {3079 - 3090}
}
```

16 changes: 16 additions & 0 deletions autoPyTorch/api/base_task.py
@@ -762,6 +762,7 @@ def _search(
budget_type (str):
Type of budget to be used when fitting the pipeline.
It can be one of:

+ `epochs`: The training of each pipeline will be terminated after
a number of epochs have passed. This number of epochs is determined by the
budget argument of this method.
@@ -840,6 +841,21 @@ def _search(
Numeric precision used when loading ensemble data.
Can be either '16', '32' or '64'.
disable_file_output (Union[bool, List]):
If True, disable model and prediction output.
Can also be used as a list to pass more fine-grained
information on what to save. Allowed elements in the list are:

+ `y_optimization`:
do not save the predictions for the optimization set,
which would later on be used to build an ensemble. Note that SMAC
optimizes a metric evaluated on the optimization set.
+ `pipeline`:
do not save any individual pipeline files
+ `pipelines`:
In case of cross validation, disables saving the joint model of the
pipelines fit on each fold.
+ `y_test`:
do not save the predictions for the test set.
load_models (bool: default=True):
Whether to load the models after fitting AutoPyTorch.
portfolio_selection (Optional[str]):
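
The allowed `disable_file_output` entries listed in the docstring above could be checked before a search is started. The following is a minimal sketch, not Auto-PyTorch code: `normalize_disable_file_output` is a hypothetical helper name, and treating `True` as "skip all four listed outputs" is an assumption made here for illustration; the docstring only says that `True` disables model and prediction output.

```python
from typing import List, Set, Union

# Element names copied from the docstring above.
ALLOWED_OUTPUTS = ("y_optimization", "pipeline", "pipelines", "y_test")


def normalize_disable_file_output(value: Union[bool, List[str]]) -> Set[str]:
    """Hypothetical helper: return the set of outputs that would be skipped."""
    if isinstance(value, bool):
        # Assumption: True is interpreted as skipping every listed output.
        return set(ALLOWED_OUTPUTS) if value else set()
    # Reject anything that is not one of the documented element names.
    unknown = [item for item in value if item not in ALLOWED_OUTPUTS]
    if unknown:
        raise ValueError(f"Unknown disable_file_output entries: {unknown}")
    return set(value)
```

Validating the list early turns a misspelled entry such as `"y_tests"` into an immediate error instead of a silently ignored option.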
16 changes: 16 additions & 0 deletions autoPyTorch/api/tabular_classification.py
@@ -159,6 +159,7 @@ def search(
budget_type (str):
Type of budget to be used when fitting the pipeline.
It can be one of:

+ `epochs`: The training of each pipeline will be terminated after
a number of epochs have passed. This number of epochs is determined by the
budget argument of this method.
@@ -237,6 +238,21 @@ def search(
Numeric precision used when loading ensemble data.
Can be either '16', '32' or '64'.
disable_file_output (Union[bool, List]):
If True, disable model and prediction output.
Can also be used as a list to pass more fine-grained
information on what to save. Allowed elements in the list are:

+ `y_optimization`:
do not save the predictions for the optimization set,
which would later on be used to build an ensemble. Note that SMAC
optimizes a metric evaluated on the optimization set.
+ `pipeline`:
do not save any individual pipeline files
+ `pipelines`:
In case of cross validation, disables saving the joint model of the
pipelines fit on each fold.
+ `y_test`:
do not save the predictions for the test set.
load_models (bool: default=True):
Whether to load the models after fitting AutoPyTorch.
portfolio_selection (Optional[str]):
20 changes: 18 additions & 2 deletions autoPyTorch/api/tabular_regression.py
@@ -160,6 +160,7 @@ def search(
budget_type (str):
Type of budget to be used when fitting the pipeline.
It can be one of:

+ `epochs`: The training of each pipeline will be terminated after
a number of epochs have passed. This number of epochs is determined by the
budget argument of this method.
@@ -173,15 +174,15 @@ def search(
is used, min_budget will refer to epochs whereas if budget_type=='runtime' then
min_budget will refer to seconds.
min_budget (int):
Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>_` to
Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>`_ to
trade-off resources between running many pipelines at min_budget and
running the top performing pipelines on max_budget.
min_budget states the minimum resource allocation a pipeline should have
so that we can compare and quickly discard bad performing models.
For example, if the budget_type is epochs, and min_budget=5, then we will
run every pipeline to a minimum of 5 epochs before performance comparison.
max_budget (int):
Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>_` to
Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>`_ to
trade-off resources between running many pipelines at min_budget and
running the top performing pipelines on max_budget.
max_budget states the maximum resource allocation a pipeline is going to
@@ -238,6 +239,21 @@ def search(
Numeric precision used when loading ensemble data.
Can be either '16', '32' or '64'.
disable_file_output (Union[bool, List]):
If True, disable model and prediction output.
Can also be used as a list to pass more fine-grained
information on what to save. Allowed elements in the list are:

+ `y_optimization`:
do not save the predictions for the optimization set,
which would later on be used to build an ensemble. Note that SMAC
optimizes a metric evaluated on the optimization set.
+ `pipeline`:
do not save any individual pipeline files
+ `pipelines`:
In case of cross validation, disables saving the joint model of the
pipelines fit on each fold.
+ `y_test`:
do not save the predictions for the test set.
load_models (bool: default=True):
Whether to load the models after fitting AutoPyTorch.
portfolio_selection (Optional[str]):
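
The `min_budget` and `max_budget` docstrings in this file describe Hyperband trading resources between many cheap runs and a few expensive ones. As a rough illustration of that idea only, the sketch below builds a geometric ladder of budgets between the two limits. The function name `budget_rungs` and the reduction factor `eta=3` are assumptions for this example; the actual schedule used by Auto-PyTorch and SMAC may differ.

```python
from typing import List


def budget_rungs(min_budget: float, max_budget: float, eta: float = 3.0) -> List[float]:
    """Illustrative only: budgets spaced by a factor of eta, ending at max_budget."""
    if not 0 < min_budget <= max_budget:
        raise ValueError("require 0 < min_budget <= max_budget")
    if eta <= 1:
        raise ValueError("eta must be greater than 1")
    rungs = [float(max_budget)]
    # Keep dividing by eta while the next rung stays at or above min_budget.
    while rungs[-1] / eta >= min_budget:
        rungs.append(rungs[-1] / eta)
    # Return the cheapest budget first.
    return rungs[::-1]


# With budget_type='epochs', min_budget=1 and max_budget=9, this gives
# three rungs: many pipelines at 1 epoch, fewer at 3, the best at 9.
print(budget_rungs(1, 9))
```

Under this scheme, pipelines that perform poorly at a low rung are discarded before any larger budget is spent on them.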
2 changes: 2 additions & 0 deletions docs/extending.rst
@@ -5,3 +5,5 @@
======================
Extending Auto-PyTorch
======================

TODO
2 changes: 1 addition & 1 deletion setup.py
@@ -23,7 +23,7 @@
name="autoPyTorch",
version="0.1.0",
author="AutoML Freiburg",
author_email="[email protected]",
author_email="[email protected]",
Collaborator

Shouldn't it be a freiburg email address?

Contributor Author

I asked for his email address for this purpose and that's what he gave me. I think it's fine.

description=("Auto-PyTorch searches neural architectures using smac"),
long_description=long_description,
url="https://github.com/automl/Auto-PyTorch",