Add Llama 3.2 Vision inference service with CPU-only Q4_K_M quantization #12
Co-authored-by: wpowiertowski <671688+wpowiertowski@users.noreply.github.com>
wpowiertowski
left a comment
The schema file should be a JSON file describing the expected response format, but it looks like you implemented some model logic there.
- Add response_schema.json with formal JSON schema definitions
- Rename schema.py to models.py to clarify it contains internal validation logic
- Update README to reference the JSON schema file
- Update documentation with JSON examples instead of Python type hints

Co-authored-by: wpowiertowski <671688+wpowiertowski@users.noreply.github.com>
Fixed in b49c532. Created |
- Use Literal types for error_type and status fields
- Ensures runtime validation matches JSON schema constraints
- Update docstring to reference response_schema.json

Co-authored-by: wpowiertowski <671688+wpowiertowski@users.noreply.github.com>
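The idea behind using Literal types for `error_type` and `status` can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code in models.py: the allowed values, the inlined schema fragment, and the helper names `find_mismatches` and `is_allowed` are all assumptions made for the example, since the PR does not list the actual enum values. The real service uses Pydantic for runtime validation, which this sketch does not reproduce.

```python
from typing import Literal, get_args

# Hypothetical allowed values; the PR does not state the real ones.
Status = Literal["success", "error"]
ErrorType = Literal["validation_error", "inference_error"]

# Hypothetical fragment of response_schema.json, inlined as a dict
# so the example stays self-contained.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "status": {"enum": ["success", "error"]},
        "error_type": {"enum": ["validation_error", "inference_error"]},
    },
}

# Maps each constrained field to the Literal type used on the Python side.
LITERALS = {"status": Status, "error_type": ErrorType}


def find_mismatches(schema, literals):
    """Return the fields whose schema enum differs from the Literal values."""
    mismatches = []
    for field, literal in literals.items():
        schema_values = set(schema["properties"][field]["enum"])
        if schema_values != set(get_args(literal)):
            mismatches.append(field)
    return mismatches


def is_allowed(field, value):
    """Check a value against the Literal declared for the given field."""
    return value in get_args(LITERALS[field])
```

A check like `find_mismatches(RESPONSE_SCHEMA, LITERALS)` could run in a unit test so that the JSON schema and the Python-side constraints cannot drift apart unnoticed.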
Implementation Plan: Llama 3.2 Vision Docker with Flask Webhook
llama-vision for the project
Latest Changes:
Complete Implementation:
All requirements met with proper schema definition (JSON), runtime validation (Pydantic with type constraints), and comprehensive documentation.
Original prompt