Commit

update tutorial
mastoffel committed Nov 27, 2024
1 parent 8a6c159 commit 864127c
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/tutorials/01_start.ipynb
@@ -813,9 +813,9 @@
"source": [
"Although we tried to chose default model parameters that work well in a wide range of scenarios, hyperparameter search will often find an emulator model with a better fit. Internally, `AutoEmulate` compares the performance of different models and hyperparameters using cross-validation on the training data, which can be computationally expensive and time-consuming for larger datasets. To speed it up, we can parallelise the process with `n_jobs`.\n",
"\n",
-"For each model, we've pre-defined a search space for hyperparameters. When setting up `AutoEmulate` with `param_search=True`, we default to using random search with `param_search_iters = 20` iterations. We plan to add other hyperparameter search methods in the future. \n",
+"For each model, we've pre-defined a search space for hyperparameters. When setting up `AutoEmulate` with `param_search=True`, we default to using random search with `param_search_iters = 20` iterations. This means that 20 hyperparameter combinations from the search space are sampled and evaluated. We plan to add other hyperparameter search methods in the future. \n",
"\n",
-"Let's do a hyperparameter search for the Gaussian Process and Random Forest models."
+"Let's do a hyperparameter search for the Support Vector Machines and Random Forest models."
]
},
{
@@ -1352,7 +1352,7 @@
],
"source": [
"em = AutoEmulate()\n",
-"em.setup(X, y, param_search=True, param_search_type=\"random\", param_search_iters=20, models=[\"GaussianProcess\", \"RandomForest\"], n_jobs=-2) # n_jobs=-2 uses all cores but one\n",
+"em.setup(X, y, param_search=True, param_search_type=\"random\", param_search_iters=10, models=[\"SupportVectorMachines\", \"RandomForest\"], n_jobs=-2) # n_jobs=-2 uses all cores but one\n",
"em.compare()"
]
},
@@ -1427,7 +1427,7 @@
"metadata": {},
"source": [
"**Notes**: \n",
-"* Some models, such as `GaussianProcess` can be slow to run hyperparameter search on larger datasets (say n > 1500). \n",
+"* Some models, such as `GaussianProcess` can be slow when conducting hyperparameter search on larger datasets (say n > 1000). \n",
"* Use the `models` argument to only run hyperparameter search on a subset of models to speed up the process.\n",
"* When possible, use `n_jobs` to parallelise the hyperparameter search. With larger datasets, we recommend setting `param_search_iters` to a lower number, such as 5, to see how long it takes to run and then increase it if necessary.\n",
"* all models can be specified with short names too, such as `rf` for `RandomForest`, `gp` for `GaussianProcess`, `svm` for `SupportVectorMachines`, etc"
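
For readers of this change: the sentence added to the tutorial says that random search samples and evaluates a fixed number of hyperparameter combinations from a search space. The sketch below illustrates that idea in plain Python. It is not `AutoEmulate`'s implementation; `sample_params`, `random_search`, `search_space`, and `toy_score` are hypothetical names invented for this illustration, and the toy score stands in for a cross-validated model score.

```python
import random


def sample_params(search_space, rng):
    """Draw one hyperparameter combination, picking each value independently."""
    return {name: rng.choice(values) for name, values in search_space.items()}


def random_search(search_space, score_fn, n_iter=20, seed=0):
    """Evaluate n_iter randomly sampled combinations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = sample_params(search_space, rng)
        score = score_fn(params)  # stand-in for a cross-validated score
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space (not AutoEmulate's pre-defined one).
search_space = {
    "n_estimators": [50, 100, 200, 500],
    "max_depth": [2, 4, 8, 16, None],
}


def toy_score(params):
    """Toy objective that peaks at n_estimators=200, max_depth=8."""
    depth = params["max_depth"] or 32  # treat None (unlimited) as a large depth
    return -abs(params["n_estimators"] - 200) / 100 - abs(depth - 8) / 8


best_params, best_score = random_search(search_space, toy_score, n_iter=20)
print(best_params, best_score)
```

Lowering `n_iter` reduces the number of evaluations proportionally, which is the trade-off behind the note about starting with a small `param_search_iters` on larger datasets.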