Skip to content

Conversation

jamesbraza
Copy link
Collaborator

This PR:

@jamesbraza jamesbraza self-assigned this Oct 9, 2025
@jamesbraza jamesbraza added the enhancement New feature or request label Oct 9, 2025
@Copilot Copilot AI review requested due to automatic review settings October 9, 2025 00:49
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds multimodal embedding support to LMI by upstreaming utilities from paper-qa and implementing VertexAI multimodal embedding testing. The changes include new byte/string conversion utilities, image validation functions, token estimation for multimodal content, and comprehensive test coverage for multimodal embeddings.

  • Adds utility functions for handling binary data and image validation
  • Implements multimodal token estimation for rate limiting
  • Adds comprehensive testing for VertexAI multimodal embeddings with GCS integration

Reviewed Changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/lmi/src/lmi/utils.py Added utility functions for byte/string conversion, image validation, and image encoding
packages/lmi/src/lmi/embeddings.py Enhanced token estimation to support multimodal content and updated rate limiting
packages/lmi/tests/test_utils.py Added comprehensive tests for byte/string conversion utilities
packages/lmi/tests/test_embeddings.py Added multimodal embedding tests and token estimation tests
packages/lmi/tests/conftest.py Added fixtures for PNG image testing and GCS integration
packages/lmi/pyproject.toml Added image optional dependency with pillow requirement
pyproject.toml Added mypy ignore for google.cloud.storage module
.github/workflows/tests.yml Added GCP authentication for CI testing

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@jabra jabra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

jamesbraza added a commit to Future-House/paper-qa that referenced this pull request Oct 9, 2025
@jamesbraza jamesbraza merged commit 180fa50 into main Oct 9, 2025
6 checks passed
@jamesbraza jamesbraza deleted the multimodal branch October 9, 2025 23:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants