Added integration tests for meta (#274)
Added integration tests for `ChatBedrockConverse` with the Meta provider.

Two tests are currently failing:
1. `test_tool_calling` consistently returns a quoted string input value
instead of the expected integer, e.g., `{input: '3'}` instead of
`{input: 3}`, which fails the test. This needs further investigation to
determine whether it is a bug in the Bedrock service or expected
behavior of Llama models when a numeric input is required.
2. `test_tool_message_histories_list_content` also fails for several
other providers, because Bedrock doesn't seem to allow conversation
blocks and tool use blocks in the same turn:
   ```python
   botocore.errorfactory.ValidationException: An error occurred
   (ValidationException) when calling the Converse operation:
   messages.1.content: Conversation blocks and tool use blocks cannot be
   provided in the same turn.
   ```
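The type mismatch in the first failure can be reproduced without calling Bedrock. The sketch below is illustrative only and is not part of this commit: `coerce_integer_args` is a hypothetical helper, and the schema is an assumed JSON-schema shape for a tool with a single integer argument named `input`. Whether coercing on the client side is the right fix is exactly what the investigation above still has to decide.

```python
from typing import Any, Dict


def coerce_integer_args(args: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of args where string values for integer-typed fields become ints.

    Hypothetical helper for illustration; it is not a langchain-aws or Bedrock API.
    """
    properties = schema.get("properties", {})
    coerced = dict(args)
    for name, value in args.items():
        expected_type = properties.get(name, {}).get("type")
        if expected_type == "integer" and isinstance(value, str):
            try:
                coerced[name] = int(value)
            except ValueError:
                # Leave values that are not valid integers untouched.
                pass
    return coerced


# Assumed schema for a tool that takes one integer argument named "input".
schema = {"type": "object", "properties": {"input": {"type": "integer"}}}

observed = {"input": "3"}  # shape described in the commit message
expected = {"input": 3}    # shape the test expects

# The raw comparison fails because "3" (str) is not equal to 3 (int).
mismatch = observed != expected
# After coercion against the schema, the arguments match.
fixed = coerce_integer_args(observed, schema) == expected
```

A schema-driven conversion like this only touches fields declared as integers, so string arguments that happen to look numeric are left alone.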
3coins authored Nov 8, 2024
1 parent b539f3d commit d25b78e
Showing 1 changed file with 38 additions and 0 deletions.
@@ -70,6 +70,44 @@ def test_tool_calling_with_no_arguments(self, model: BaseChatModel) -> None:
        pass


class TestBedrockMetaStandard(ChatModelIntegrationTests):
    @property
    def chat_model_class(self) -> Type[BaseChatModel]:
        return ChatBedrockConverse

    @property
    def chat_model_params(self) -> dict:
        return {"model": "us.meta.llama3-2-90b-instruct-v1:0"}

    @property
    def standard_chat_model_params(self) -> dict:
        return {"temperature": 0.1, "max_tokens": 100, "stop": []}

    @pytest.mark.xfail(reason="Meta models don't support tool_choice.")
    def test_structured_few_shot_examples(self, model: BaseChatModel) -> None:
        pass

    # TODO: Investigate whether this is a bug with Bedrock or with Llama models.
    # This test consistently seems to return single-quoted input values
    # {input: '3'} instead of {input: 3}, which fails the test. With tools that
    # take non-numeric inputs, tool calling seems to work as expected with
    # Bedrock and Llama models.
    @pytest.mark.xfail(
        reason="Bedrock Meta models tend to return string values for integer inputs."
    )
    def test_tool_calling(self, model: BaseChatModel) -> None:
        super().test_tool_calling(model)

    @pytest.mark.xfail(reason="Meta models don't support tool_choice.")
    def test_tool_calling_with_no_arguments(self, model: BaseChatModel) -> None:
        pass

    @pytest.mark.xfail(
        reason="Human messages following AI messages not supported by Bedrock."
    )
    def test_tool_message_histories_list_content(self, model: BaseChatModel) -> None:
        super().test_tool_message_histories_list_content(model)


def test_structured_output_snake_case() -> None:
    model = ChatBedrockConverse(
        model="anthropic.claude-3-sonnet-20240229-v1:0", temperature=0