update tutorial

mastoffel · mastoffel · commit 864127c2c427 · 2024-11-27T15:26:32.000Z
diff --git a/docs/tutorials/01_start.ipynb b/docs/tutorials/01_start.ipynb
@@ -813,9 +813,9 @@
    "source": [
     "Although we tried to choose default model parameters that work well in a wide range of scenarios, hyperparameter search will often find an emulator model with a better fit. Internally, `AutoEmulate` compares the performance of different models and hyperparameters using cross-validation on the training data, which can be computationally expensive and time-consuming for larger datasets. To speed it up, we can parallelise the process with `n_jobs`.\n",
     "\n",
-    "For each model, we've pre-defined a search space for hyperparameters. When setting up `AutoEmulate` with `param_search=True`, we default to using random search with `param_search_iters = 20` iterations. We plan to add other hyperparameter search methods in the future. \n",
+    "For each model, we've pre-defined a search space for hyperparameters. When setting up `AutoEmulate` with `param_search=True`, we default to using random search with `param_search_iters = 20` iterations. This means that 20 hyperparameter combinations from the search space are sampled and evaluated. We plan to add other hyperparameter search methods in the future. \n",
     "\n",
-    "Let's do a hyperparameter search for the Gaussian Process and Random Forest models."
+    "Let's do a hyperparameter search for the Support Vector Machines and Random Forest models."
    ]
   },
   {
@@ -1352,7 +1352,7 @@
    ],
    "source": [
     "em = AutoEmulate()\n",
-    "em.setup(X, y, param_search=True, param_search_type=\"random\", param_search_iters=20, models=[\"GaussianProcess\", \"RandomForest\"], n_jobs=-2) # n_jobs=-2 uses all cores but one\n",
+    "em.setup(X, y, param_search=True, param_search_type=\"random\", param_search_iters=10, models=[\"SupportVectorMachines\", \"RandomForest\"], n_jobs=-2) # n_jobs=-2 uses all cores but one\n",
     "em.compare()"
    ]
   },
@@ -1427,7 +1427,7 @@
    "metadata": {},
    "source": [
     "**Notes**: \n",
-    "* Some models, such as `GaussianProcess` can be slow to run hyperparameter search on larger datasets (say n > 1500). \n",
+    "* Some models, such as `GaussianProcess`, can be slow when conducting hyperparameter search on larger datasets (say n > 1000). \n",
     "* Use the `models` argument to only run hyperparameter search on a subset of models to speed up the process.\n",
     "* When possible, use `n_jobs` to parallelise the hyperparameter search. With larger datasets, we recommend setting `param_search_iters` to a lower number, such as 5, to see how long it takes to run and then increase it if necessary.\n",
     "* All models can be specified with short names too, such as `rf` for `RandomForest`, `gp` for `GaussianProcess`, `svm` for `SupportVectorMachines`, etc."