[Inference API] Add Docs for Amazon Bedrock Support for the Inference API #110594
markjhoy merged 5 commits into elastic:main from markjhoy:markjhoy/add_docs_amazon_bedrock_inference_api
Conversation
Documentation preview:
timgrein left a comment:

Just highlighting some Azure AI Studio references. (Sorry, I just saw that this is a draft, but leaving the comments here so we don't forget them :) )
Creates an {infer} endpoint to perform an {infer} task with the `amazonbedrock` service.

[discrete]
[[infer-service-azure-ai-studio-api-request]]

Suggested change:
- [[infer-service-azure-ai-studio-api-request]]
+ [[infer-service-amazon-bedrock-api-request]]
`PUT /_inference/<task_type>/<inference_id>`

[discrete]
[[infer-service-azure-ai-studio-api-path-params]]

Suggested change:
- [[infer-service-azure-ai-studio-api-path-params]]
+ [[infer-service-amazon-bedrock-path-params]]
`rate_limit`:::
(Optional, object)
By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
This helps to minimize the number of rate limit errors returned from Azure AI Studio.

Suggested changes:
- By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
+ By default, the `amazonbedrock` service sets the number of requests allowed per minute to `240`.
- This helps to minimize the number of rate limit errors returned from Azure AI Studio.
+ This helps to minimize the number of rate limit errors returned from Amazon Bedrock.
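For context, the `rate_limit` object under discussion nests inside the endpoint's `service_settings`. A minimal sketch of overriding the default, assuming the credential field names for the `amazonbedrock` service (they are not part of this diff):

```json
{
  "service": "amazonbedrock",
  "service_settings": {
    "access_key": "<aws-access-key>",
    "secret_key": "<aws-secret-key>",
    "region": "us-east-1",
    "rate_limit": {
      "requests_per_minute": 120
    }
  }
}
```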
Argh - great catches ;) That's what I get for copy/pasting.
@elasticmachine run docs build
Pinging @elastic/es-docs (Team:Docs)
+
.`task_settings` for the `text_embedding` task type
[%collapsible%closed]
=====
@markjhoy I think this unclosed ==== block might be breaking your build :)
Ah thanks! I could not figure out for the life of me where that error was coming from!
leemthompo left a comment:
This is looking good. Found a few minor errors and suggested some rephrasings. Also hopefully identified the formatting issue that's failing the docs build. Once these updates are made and we can preview the formatting for tabs, it will be ready for final review!
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Should not be used if `temperature` or `top_k` is specified.

`top_p`:::

Suggested change:
- `top_p`:::
+ `top_k`:::
Assuming the first top_p is the correct one 😉
`max_new_tokens`:::
(Optional, integer)
Provides a hint for the maximum number of output tokens to be generated.

Suggested change:
- Provides a hint for the maximum number of output tokens to be generated.
+ Sets a maximum number for the output tokens to be generated.
Not sure what "hint" means here; the rewording tries to clarify.
`temperature`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.

Suggested change:
- A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.
+ A number between 0.0 and 1.0 that controls the apparent creativity of the results. At temperature 0.0 the model is most deterministic, at temperature 1.0 most random.
`top_p`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.

Suggested change:
- A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
+ Alternative to `temperature`. A number in the range of 0.0 to 1.0, to eliminate low-probability tokens. Top-p uses nucleus sampling to select top tokens whose sum of likelihoods does not exceed a certain value, ensuring both variety and coherence.
`top_p`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Should not be used if `temperature` or `top_k` is specified.
Reading around it looks like top-p and top-k can be used in combination?
FYI - you're correct here... theoretically, you can use all three, but you shouldn't use `temperature` and `top_p` at the same time. For reference, see the parameters in Amazon's Anthropic docs.
`top_p`:::
(Optional, float)
Only available for `anthropic`, `cohere`, and `mistral` providers.
A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.

Suggested change:
- A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.
+ Alternative to `temperature`. Limits samples to the top-K most likely words, balancing coherence and variability.
+ A number in the range of 0.0 to 1.0.
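Taken together, the chat-completion `task_settings` being reviewed above would look roughly like this. The values are illustrative only; per the thread, `temperature` and `top_p` should not be combined, so this sketch pairs `temperature` with `top_k` instead:

```json
{
  "task_settings": {
    "temperature": 0.7,
    "top_k": 40,
    "max_new_tokens": 512
  }
}
```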
The following example shows how to create an {infer} endpoint called `amazon_bedrock_embeddings` to perform a `text_embedding` task type.

The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.

Suggested change:
- The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.
+ Choose chat completion and embeddings models you have access to from the https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base models].
nit: keep sentence short
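For reference, a request along the lines of the `amazon_bedrock_embeddings` example the docs describe. The `service_settings` field names here are assumptions based on the discussion above, and the Titan model ID is only illustrative:

```
PUT _inference/text_embedding/amazon_bedrock_embeddings
{
  "service": "amazonbedrock",
  "service_settings": {
    "access_key": "<aws-access-key>",
    "secret_key": "<aws-secret-key>",
    "region": "us-east-1",
    "provider": "amazontitan",
    "model": "amazon.titan-embed-text-v1"
  }
}
```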
💚 Backport successful
… API (#110594)
* Add Amazon Bedrock Inference API to docs
* fix example errors
* update semantic search tutorial; add changelog
* fix typo
* fix error; accept suggestions
Adds docs in support of Amazon Bedrock support in the Inference API: #110248