|
127 | 127 | " 'base_url': '<your first Azure OpenAI API base here>',\n",
|
128 | 128 | " 'api_type': 'azure',\n",
|
129 | 129 | " 'api_version': '2023-06-01-preview',\n",
|
130 |
| - " }, # only if the at least one Azure OpenAI API key is found\n", |
| 130 | + " }, # only if at least one Azure OpenAI API key is found\n", |
131 | 131 | " {\n",
|
132 | 132 | " 'api_key': '<your second Azure OpenAI API key here>',\n",
|
133 | 133 | " 'base_url': '<your second Azure OpenAI API base here>',\n",
|
|
277 | 277 | "source": [
|
278 | 278 | "## Define Success Metric\n",
|
279 | 279 | "\n",
|
280 |
| - "Before we start tuning, we need to define the success metric we want to optimize. For each math task, we use voting to select a response with the most common answers out of all the generated responses. If it has an equivalent answer to the canonical solution, we consider the task as successfully solved. Then we can optimize the mean success rate of a collection of tasks." |
| 280 | + "Before we start tuning, we must define the success metric we want to optimize. For each math task, we use voting to select the response whose answer is the most common among all the generated responses. We consider the task successfully solved if the voted answer is equivalent to the canonical solution's answer. Then we can optimize the mean success rate over a collection of tasks." |
281 | 281 | ]
|
282 | 282 | },
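The voting-based metric can be pictured with a short sketch. This is illustrative only: the function name `success_vote` and the `is_equivalent` answer checker are hypothetical placeholders, not the notebook's actual helpers.

```python
from collections import Counter

def success_vote(answers, canonical_answer, is_equivalent):
    """Illustrative sketch of the voting-based success metric.

    `answers` are the final answers extracted from the generated responses,
    `canonical_answer` is the answer of the canonical solution, and
    `is_equivalent` is a hypothetical checker for mathematical equivalence.
    """
    if not answers:
        return 0
    # Voting: pick the most common answer among all generated responses.
    voted_answer, _ = Counter(answers).most_common(1)[0]
    # The task counts as solved if the voted answer matches the canonical one.
    return int(is_equivalent(voted_answer, canonical_answer))
```

Averaging this 0/1 score over a collection of tasks gives the mean success rate that the tuning procedure tries to maximize.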
|
283 | 283 | {
|
|
346 | 346 | "\n",
|
347 | 347 | "The tuning will take a while to finish, depending on the optimization budget. It is performed under the specified optimization budgets (a sketch of a tuning call follows the list below).\n",
|
348 | 348 | "\n",
|
349 |
| - "* `inference_budget` is the target average inference budget per instance in the benchmark. For example, 0.004 means the target inference budget is 0.004 dollars, which translates to 2000 tokens (input + output combined) if the gpt-3.5-turbo model is used.\n", |
350 |
| - "* `optimization_budget` is the total budget allowed to perform the tuning. For example, 1 means 1 dollars are allowed in total, which translates to 500K tokens for the gpt-3.5-turbo model.\n", |
351 |
| - "* `num_sumples` is the number of different hyperparameter configurations which is allowed to try. The tuning will stop after either num_samples trials or after optimization_budget dollars spent, whichever happens first. -1 means no hard restriction in the number of trials and the actual number is decided by `optimization_budget`.\n", |
| 349 | + "* `inference_budget` is the benchmark's target average inference budget per instance. For example, 0.004 means the target inference budget is 0.004 dollars, which translates to 2000 tokens (input + output combined) if the gpt-3.5-turbo model is used.\n", |
| 350 | + "* `optimization_budget` is the total budget allowed for tuning. For example, 1 means 1 dollar is allowed in total, which translates to 500K tokens for the gpt-3.5-turbo model.\n", |
| 351 | + "* `num_samples` is the number of different hyperparameter configurations allowed to be tried. The tuning will stop after either `num_samples` trials are completed or `optimization_budget` dollars are spent, whichever happens first. -1 means no hard restriction on the number of trials; the actual number is decided by `optimization_budget`.\n", |
352 | 352 | "\n",
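To make these budget knobs concrete, here is a minimal sketch of a tuning call. The entry point (`autogen.ChatCompletion.tune`), the data variable `tune_data`, the metric name `success_vote`, and the evaluation function `eval_math_responses` are assumptions for illustration; the actual call in the notebook may differ.

```python
import autogen  # assumed entry point; the notebook's import may differ

# Sketch only: `tune_data` and `eval_math_responses` are placeholders.
config, analysis = autogen.ChatCompletion.tune(
    data=tune_data,                 # tuning data (a list of math problem instances)
    metric="success_vote",          # optimization metric reported by the eval function
    mode="max",                     # maximize the mean success rate
    eval_func=eval_math_responses,  # evaluation function (voting-based success metric)
    inference_budget=0.004,         # target average $ per instance (~2000 gpt-3.5-turbo tokens)
    optimization_budget=1,          # total $ allowed for tuning (~500K gpt-3.5-turbo tokens)
    num_samples=-1,                 # no hard cap on trials; bounded by optimization_budget
)
```

With `num_samples=-1`, the number of trials is determined entirely by how quickly `optimization_budget` is consumed.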
|
353 | 353 | "Users can specify tuning data, optimization metric, optimization mode, evaluation function, search space, etc. The default search space is:\n",
|
354 | 354 | "\n",
|
|
371 | 371 | "```\n",
|
372 | 372 | "\n",
|
373 | 373 | "The default search space can be overridden by users' input.\n",
|
374 |
| - "For example, the following code specifies a fixed prompt template. For hyperparameters which don't appear in users' input, the default search space will be used." |
| 374 | + "For example, the following code specifies a fixed prompt template. The default search space will be used for hyperparameters that don't appear in users' input." |
375 | 375 | ]
|
376 | 376 | },
|
377 | 377 | {
|
|