Bedrock token count callbacks #20

Merged: 31 commits, merged May 21, 2024

Changes from 1 commit

Commits
7a22060
Updating _prepare_input_and_invoke to use the on_llm_end callback
NAPTlME Apr 20, 2024
929e3db
Consolidated GenerationChunk creation to _stream_response_to_generati…
NAPTlME Apr 20, 2024
9b840d1
Added an on_llm_end call to the streaming route
NAPTlME Apr 20, 2024
5802e53
Updating async streaming options to match streaming functions wrt mes…
NAPTlME Apr 22, 2024
140333b
Updating _stream_response_to_generation_chunk to get the stop reason …
NAPTlME Apr 22, 2024
2f2c8ed
Ending on "message_stop" rather than "content_block_stop" for message…
NAPTlME Apr 22, 2024
dc9c9cd
Updating _prepare_input_and_invoke to return a packaged llm_output …
NAPTlME Apr 22, 2024
a91b3c7
Renaming to provider_stop_reason_key_map to better convey the intende…
NAPTlME Apr 22, 2024
0f345c3
_combine_generation_info_for_llm_result is now using the generation_i…
NAPTlME Apr 22, 2024
25e9caf
changing to output.get() rather than pop() to prevent the usage from …
NAPTlME Apr 22, 2024
8006cb5
Passing generation_info through the ChatGenerationChunk from _stream
NAPTlME Apr 22, 2024
1eeb60f
Getting llm_output (containing `usage` and `stop_reason` from both st…
NAPTlME Apr 22, 2024
afda911
Linting/Formatting/codespell changes
NAPTlME Apr 22, 2024
02893cc
Updating a function description
NAPTlME Apr 22, 2024
794f9e5
Line Length for Ruff
NAPTlME Apr 22, 2024
b4c5f61
Making compatible with python 3.8
NAPTlME Apr 22, 2024
6a870de
Ruff Formatting
NAPTlME Apr 23, 2024
932204a
Fixing issue with path and branch specification in lint_diff/format_diff
NAPTlME Apr 23, 2024
4750b80
Addressing linting/formatting issues
NAPTlME Apr 23, 2024
8a2465c
Typing compatibility for python < 3.10
NAPTlME Apr 25, 2024
64b3578
Merge branch 'langchain-ai:main' into bedrock-token-count-callbacks
NAPTlME Apr 29, 2024
1526f5a
Merge branch 'main' into bedrock-token-count-callbacks
NAPTlME May 3, 2024
7a7e39f
Adding a callback to test token counts on_llm_end
NAPTlME May 17, 2024
98d6d09
Adding integration test to verify token counts and stop reason are ca…
NAPTlME May 17, 2024
0f681cb
Fixing codespell errors
NAPTlME May 17, 2024
f1350c4
Merge branch 'langchain-ai:main' into bedrock-token-count-callbacks
NAPTlME May 17, 2024
8908cd8
Updating usage info for input/output tokens to use lists to contain t…
NAPTlME May 17, 2024
aa33982
Merge branch 'bedrock-token-count-callbacks' of https://github.com/NA…
NAPTlME May 17, 2024
f5e4431
Adding the usage_info transformation in which the integers are put in…
NAPTlME May 17, 2024
5eace06
Moving the usage_info nesting of token counts into lists into a funct…
NAPTlME May 17, 2024
bfa0871
Ran into an issue where the `test_chat_bedrock_streaming_generation_i…
NAPTlME May 17, 2024
Consolidated GenerationChunk creation to _stream_response_to_generation_chunk

Also allowed additional response_body information to be passed into the generation_info (such as token counts and stop reasons)
NAPTlME committed Apr 20, 2024
commit 929e3db62e0767c45ac72649214ee224c0e4b29f
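For context, the Anthropic messages-API stream events that this commit starts branching on look roughly like the dictionaries below. The values are illustrative only and are not taken from the PR; the exact keys Bedrock delivers may differ. They are shown to make the `message_start` / `content_block_delta` / `message_delta` cases in the new code easier to follow.

```python
# Illustrative event shapes only -- field names and values are assumptions, not from the PR.

# "message_start": carries the input token count for the request.
message_start_event = {
    "type": "message_start",
    "message": {"usage": {"input_tokens": 25}},
}

# "content_block_delta": carries a piece of generated text.
content_block_delta_event = {
    "type": "content_block_delta",
    "delta": {"text": "Hello"},
}

# "message_delta": carries the stop reason and the output token count.
message_delta_event = {
    "type": "message_delta",
    "stop_reason": "end_turn",
    "usage": {"output_tokens": 42},
}
```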
74 changes: 41 additions & 33 deletions libs/aws/langchain_aws/llms/bedrock.py
@@ -81,16 +81,44 @@ def _human_assistant_format(input_text: str) -> str:

 def _stream_response_to_generation_chunk(
     stream_response: Dict[str, Any],
+    provider,
+    output_key,
+    messages_api
 ) -> GenerationChunk:
     """Convert a stream response to a generation chunk."""
-    if not stream_response["delta"]:
-        return GenerationChunk(text="")
-    return GenerationChunk(
-        text=stream_response["delta"]["text"],
-        generation_info=dict(
-            finish_reason=stream_response.get("stop_reason", None),
-        ),
-    )
+    if messages_api:
+        match stream_response.get("type"):
+            case "message_start":
+                usage_info = stream_response.get("message", {}).get("usage", None)
+                generation_info = {"usage": usage_info}
+                return GenerationChunk(text="", generation_info=generation_info)
+            case "content_block_delta":
+                if not stream_response["delta"]:
+                    return GenerationChunk(text="")
+                return GenerationChunk(
+                    text=stream_response["delta"]["text"],
+                    generation_info=dict(
+                        stop_reason=stream_response.get("stop_reason", None),
+                    ),
+                )
+            case "message_delta":
+                usage_info = stream_response.get("usage", None)
+                stop_reason = stream_response.get("stop_reason")
+                generation_info = {"stop_reason": stop_reason, "usage": usage_info}
+                return GenerationChunk(text="", generation_info=generation_info)
+            case _:
+                return None
+    else:
+        # chunk obj format varies with provider
+        generation_info = {k:v for k, v in stream_response.items() if k != output_key}
+        return GenerationChunk(
+            text=(
+                stream_response[output_key]
+                if provider != "mistral"
+                else stream_response[output_key][0]["text"]
+            ),
+            generation_info=generation_info,
+        )


class LLMInputOutputAdapter:
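With this change, each chunk carries only a slice of the final metadata: input tokens on `message_start`, text on `content_block_delta`, output tokens and stop reason on `message_delta`. Something downstream has to fold those pieces together; a later commit in this PR adds `_combine_generation_info_for_llm_result` for that purpose. A minimal sketch of the kind of merge involved, using a hypothetical helper name and the illustrative keys from above, might look like:

```python
from typing import Any, Dict, List, Optional


def combine_generation_info(chunk_infos: List[Optional[Dict[str, Any]]]) -> Dict[str, Any]:
    """Hypothetical sketch: fold per-chunk generation_info dicts into one summary."""
    usage: Dict[str, int] = {}
    stop_reason = None
    for info in chunk_infos:
        if not info:
            continue
        # Sum token counts across chunks (input tokens arrive early, output tokens late).
        for key, value in (info.get("usage") or {}).items():
            usage[key] = usage.get(key, 0) + value
        # Keep the last non-empty stop reason seen.
        stop_reason = info.get("stop_reason") or stop_reason
    return {"usage": usage, "stop_reason": stop_reason}


# e.g. [{"usage": {"input_tokens": 25}}, None,
#       {"usage": {"output_tokens": 42}, "stop_reason": "end_turn"}]
# -> {"usage": {"input_tokens": 25, "output_tokens": 42}, "stop_reason": "end_turn"}
```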
@@ -223,32 +251,12 @@ def prepare_output_stream(
             elif messages_api and (chunk_obj.get("type") == "content_block_stop"):
                 return
 
-            if messages_api and chunk_obj.get("type") in (
-                "message_start",
-                "content_block_start",
-                "content_block_delta",
-            ):
-                if chunk_obj.get("type") == "content_block_delta":
-                    chk = _stream_response_to_generation_chunk(chunk_obj)
-                    yield chk
-                else:
-                    continue
+            generation_chunk = _stream_response_to_generation_chunk(chunk_obj, provider=provider, output_key=output_key, messages_api=messages_api)
+            if generation_chunk:
+                yield generation_chunk
             else:
-                # chunk obj format varies with provider
-                yield GenerationChunk(
-                    text=(
-                        chunk_obj[output_key]
-                        if provider != "mistral"
-                        else chunk_obj[output_key][0]["text"]
-                    ),
-                    generation_info={
-                        GUARDRAILS_BODY_KEY: (
-                            chunk_obj.get(GUARDRAILS_BODY_KEY)
-                            if GUARDRAILS_BODY_KEY in chunk_obj
-                            else None
-                        ),
-                    },
-                )
+                continue
 
     @classmethod
     async def aprepare_output_stream(
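The end goal of the PR, per its title and the later commits that package `usage` and `stop_reason` into `llm_output`, is that token counts become visible to callbacks when a run finishes. A minimal sketch of how a consumer might read them, assuming the merged behavior exposes a `usage` entry in `llm_output` (the exact key names are not guaranteed by this commit alone):

```python
from typing import Any, Dict, List

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult


class TokenUsageHandler(BaseCallbackHandler):
    """Collects whatever usage metadata the integration reports at on_llm_end."""

    def __init__(self) -> None:
        self.usage: List[Dict[str, Any]] = []

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        # llm_output is provider-specific; this PR routes Bedrock usage/stop_reason into it.
        llm_output = response.llm_output or {}
        if "usage" in llm_output:
            self.usage.append(llm_output["usage"])


# Hypothetical wiring (model id and credentials assumed, not part of this PR):
# from langchain_aws import ChatBedrock
# handler = TokenUsageHandler()
# llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0", callbacks=[handler])
# llm.invoke("Hello")
# print(handler.usage)
```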