An intelligent, multi-modal crop analysis and search system built using advanced Vision-Language Models (VLMs) and semantic search techniques.
This tool can detect crops from images, analyze their characteristics, and search similar crop types using natural language or voice.
- Upload photo → Get crop analysis and description
- “Show me crops with red flowers and green stems” → Voice or text → View matching images
- Automatically identifies crops from images using Qwen 2.5 Vision, a powerful multimodal large language model.
- Search for similar crops using:
- Text-based queries (e.g., “green leafy crop with wide leaves”)
- Powered by CLIP-like embeddings and cosine similarity search using ClickHouse
- Allows users to speak their search queries naturally instead of typing
- Voice is recorded using `sounddevice` and saved via `scipy.io.wavfile`
- Audio is transcribed using OpenAI's Whisper via the `whisper` Python package
- Transcribed text is passed to the semantic search engine for matching crop images
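The voice flow described above can be sketched roughly as follows. This is a minimal sketch, not the project's actual `speech_input.py`: the function names, the 16 kHz sample rate, the 5-second duration, and the `base` Whisper model size are assumptions chosen for illustration. The third-party imports sit inside the functions so the module can be imported on a machine without audio hardware.

```python
def record_to_wav(path, seconds=5, samplerate=16000):
    """Record mono audio from the default microphone and save it as a WAV file."""
    import sounddevice as sd        # third-party: pip install sounddevice
    from scipy.io import wavfile    # third-party: pip install scipy

    frames = int(seconds * samplerate)
    audio = sd.rec(frames, samplerate=samplerate, channels=1, dtype="int16")
    sd.wait()                       # block until the recording has finished
    wavfile.write(path, samplerate, audio)
    return path


def transcribe(path, model_name="base"):
    """Transcribe an audio file to text with the openai-whisper package."""
    import whisper                  # third-party: pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"].strip()


def voice_query(path="query.wav", seconds=5):
    """Record a spoken query and return its transcription for the search step."""
    return transcribe(record_to_wav(path, seconds=seconds))
```

Whisper decodes audio files through `ffmpeg`, so `ffmpeg` needs to be available on the PATH for `transcribe` to work.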
Component | Technology |
---|---|
Vision-Language Model | Qwen 2.5 Vision |
Local Model Serving | Ollama (self-hosted model runner) |
Speech-to-Text (STT) | Whisper |
Embeddings | CLIP-style vectors |
Database | ClickHouse (vector search + metadata) |
Storage | Local filesystem (for image storage) |
Backend Logic | Python |
The system is designed to:
- Analyze crop images for health, growth stage, field characteristics, and environmental conditions
- Generate comprehensive text descriptions for semantic search
- Create embeddings for similarity search using both text and image features
- Store everything in ClickHouse for efficient querying
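For illustration, the similarity step can be reduced to ranking stored vectors by cosine similarity against a query vector. The sketch below does this in memory with plain Python. The three-dimensional vectors and image names are made up, and in the actual system the vectors would come from the embedding model and be stored and ranked in ClickHouse.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, items, k=3):
    """Rank (name, vector) pairs by similarity to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values, not produced by a real model).
catalog = [
    ("sugarcane.jpg", [0.9, 0.1, 0.0]),
    ("rose.jpg", [0.1, 0.9, 0.2]),
    ("wheat.jpg", [0.7, 0.2, 0.1]),
]
results = top_k([1.0, 0.0, 0.0], catalog, k=2)
print(results)
```

With these toy values, `sugarcane.jpg` ranks first and `wheat.jpg` second, because their vectors point most nearly in the same direction as the query.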
This Python script identifies crops in an image by using an Ollama server to run vision-enabled LLMs locally, such as `llama3.2-vision` or `qwen2.5vl`, without relying on the Hugging Face Transformers library or cloud-based APIs.
It sends an image and a predefined JSON-format prompt to a selected vision model running locally via Ollama, and returns structured information about the crop detected in the image.
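A request of this kind might look like the sketch below, which posts a base64-encoded image and a prompt to Ollama's `/api/generate` endpoint using only the standard library. The function names and the default model tag are assumptions for illustration, and the project's own `ollama_client.py` may be organized differently.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model, prompt, image_bytes):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",   # ask Ollama to constrain the output to valid JSON
        "stream": False,    # return one complete reply instead of chunks
    }


def identify_crop(image_path, prompt, model="qwen2.5vl:latest", url=OLLAMA_URL):
    """Send an image and prompt to a local Ollama server and parse the reply."""
    with open(image_path, "rb") as fh:
        payload = build_payload(model, prompt, fh.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The model's text output is carried in the "response" field of the reply.
    return json.loads(body["response"])
```

Calling `identify_crop("field.jpg", prompt_text)` requires a running Ollama server with the chosen model already pulled; `field.jpg` is a placeholder path.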
By default, it uses a basic prompt, but more detailed prompts (e.g., for disease detection or richer output) can be saved as `.txt` files inside the `assets/` directory. You can create multiple prompt types such as:
- basic_prompt.txt
- detailed_prompt.txt
- multi_crop_prompt.txt
- etc.
These prompts are dynamically loaded and sent to the model, allowing customization without modifying code.
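One way such dynamic loading could work is sketched below. The `load_prompt` helper and its arguments are hypothetical and only illustrate reading a prompt by name from a directory of `.txt` files.

```python
from pathlib import Path


def load_prompt(name="basic_prompt", assets_dir="assets"):
    """Read a prompt template such as assets/basic_prompt.txt and return its text."""
    path = Path(assets_dir) / f"{name}.txt"
    if not path.is_file():
        # List what is available to make a mistyped prompt name easy to spot.
        available = sorted(p.stem for p in Path(assets_dir).glob("*.txt"))
        raise FileNotFoundError(f"Prompt '{name}' not found; available: {available}")
    return path.read_text(encoding="utf-8").strip()
```

Adding a new prompt type then only means dropping another `.txt` file into the directory and passing its name, for example `load_prompt("detailed_prompt")`.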
Identify the crop in this image and respond ONLY in the following JSON format:
{
"crop": "<primary crop name>",
"alternate_names": ["<alternate name 1>", "<alternate name 2>"],
"color": ["<color 1>", "<color 2>"],
"confidence": <confidence score from 0 to 1>
}
If any field is not known, return an empty list or null value as appropriate. Do not include any other text.
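Because a model does not always follow the "no other text" instruction, the reply is worth validating before use. The following is a minimal sketch of such a check, assuming the four fields of the prompt above; `parse_crop_response` is a hypothetical helper, not part of the project.

```python
import json


def parse_crop_response(raw):
    """Parse the model's reply into a dict with the four expected fields."""
    text = raw.strip()
    # Tolerate stray text around the JSON by keeping only the outermost braces.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(text[start:end + 1])

    confidence = data.get("confidence")
    if confidence is not None:
        confidence = float(confidence)
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")

    return {
        "crop": data.get("crop"),
        # Unknown list fields may arrive as null; normalize them to empty lists.
        "alternate_names": data.get("alternate_names") or [],
        "color": data.get("color") or [],
        "confidence": confidence,
    }
```

Normalizing missing lists to `[]` keeps downstream code from having to special-case `null`.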
While the script has been briefly tested with `qwen2.5vl:latest` and `llama3.2-vision:latest`, `qwen2.5vl:latest` is recommended based on local testing due to:
- Reasonable inference times
- Reliable structured JSON responses
- Decent resource usage on a typical commodity laptop
- Uses models like `llama3.2-vision` and `qwen2.5vl` via the Ollama API
- Accepts a local image and outputs structured JSON including:
  - Crop name
  - Alternate crop names
  - Color details
  - Confidence score
  - Metadata like inference time
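The timing metadata could be attached with a small wrapper like the one sketched here. The key names `startDateTime`, `endDateTime`, and `duration` follow this README's sample output; the `with_timing` helper itself is an assumption, not the project's code.

```python
import time
from datetime import datetime


def with_timing(func, *args, **kwargs):
    """Call func and attach start/end timestamps and duration to its dict result."""
    start = datetime.now()
    t0 = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    result["metadata"] = {
        "startDateTime": start.isoformat(),
        "endDateTime": datetime.now().isoformat(),
        "duration": round(elapsed, 2),  # seconds, rounded to two decimals
    }
    return result
```

Any function that returns a dict can be wrapped this way, for example `with_timing(run_model, image_path)` where `run_model` stands in for the actual inference call.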
The result is a structured JSON response, like:
Crop Detection
{
"crop": "Sugarcane",
"alternate_names": [
"Sugar cane",
"Cane"
],
"color": [
"Green",
"Brown"
],
"confidence": 0.95,
"metadata": {
"startDateTime": "2025-06-07T20:58:35.196729",
"endDateTime": "2025-06-07T21:00:36.916434",
"duration": 121.72
}
}
Crop Analysis
{
"crop": "Sugarcane",
"alternate_names": [
"Sugar cane",
"Saccharum officinarum"
],
"color": [
"green",
"brown"
],
"confidence": 0.95,
"overall_description": "The image shows a field of sugarcane with tall, green stalks growing in rows. The field appears to be in a vegetative growth stage, with no visible signs of flowering or fruiting. The soil is visible and appears to be well-tended, indicating a managed agricultural setting.",
"growth_stage": {
"stage": "vegetative",
"estimated_age_months": 6,
"description": "The sugarcane plants are tall and have a uniform height, indicating they are in the vegetative stage of growth. The presence of young leaves suggests they are not yet mature enough to flower or bear fruit."
},
"health_assessment": {
"overall_health": "good",
"vigor_score": 0.85,
"disease_indicators": [
"empty list"
],
"pest_indicators": [
"empty list"
],
"stress_indicators": [
"none_detected"
],
"health_description": "The sugarcane plants appear healthy with no visible signs of disease or pest damage. The leaves are green and there are no signs of yellowing or wilting, indicating good vigor and health."
},
"field_characteristics": {
"planting_pattern": "rows",
"plant_density": "medium",
"field_size_estimate": "medium_field",
"crop_uniformity": "uniform",
"weed_presence": "none",
"field_description": "The sugarcane is planted in neat rows, with a consistent spacing between plants. The field appears to be well-maintained, with no visible weeds or other vegetation competing for resources."
},
"environmental_context": {
"setting": "rural",
"terrain": "flat",
"surrounding_vegetation": "trees",
"infrastructure_visible": [
"irrigation"
],
"weather_conditions": "clear",
"environment_description": "The field is located in a rural area with a flat terrain and surrounded by trees. There is evidence of irrigation infrastructure, suggesting the field is well-supplied with water. The weather appears clear, indicating favorable growing conditions."
},
"growing_conditions": {
"moisture_level": "adequate",
"soil_visibility": "clearly_visible",
"irrigation_evidence": "irrigation",
"season_indication": "growing_season",
"conditions_description": "The soil is clearly visible and appears to be well-moistened, indicating adequate irrigation. The growing conditions suggest it is the growing season, with no signs of drought or waterlogging."
},
"agricultural_insights": {
"farming_type": "commercial",
"management_quality": "good",
"harvest_readiness": "not_ready",
"estimated_months_to_harvest": null,
"management_description": "The sugarcane field is managed with a focus on irrigation, as evidenced by the visible infrastructure. The uniform planting and healthy appearance suggest a good level of management. The field is not yet ready for harvest, as the plants are still in the vegetative stage."
},
"recommendations": [
"Continue with current irrigation practices to ensure adequate moisture levels.",
"Monitor the field for any signs of pests or diseases and take preventive measures if necessary.",
"Prepare the field for harvest when the sugarcane reaches the mature stage."
],
"recommendations_summary": "The sugarcane field is in good health and well-managed, with adequate irrigation and uniform planting. The field is not yet ready for harvest, and continued monitoring and irrigation practices are recommended to ensure optimal growth and yield.",
"image_metadata": {
"image_quality": "good",
"lighting_conditions": "natural_daylight",
"viewing_angle": "ground_level",
"coverage_area": "field_overview",
"visual_description": "The image provides a clear overview of the sugarcane field, showing the rows of plants and the surrounding environment."
},
"semantic_tags": [
"sugarcane",
"vegetative_stage",
"agricultural_management",
"irrigation",
"rural_setting"
],
"search_context": "Sugarcane field in vegetative stage, good health, irrigation managed, rural setting, clear weather",
"metadata": {
"startDateTime": "2025-06-08T20:44:27.957857",
"endDateTime": "2025-06-08T20:50:47.604600",
"duration": 379.65
},
"text_description": "The image shows a Sugarcane crop with colors green, brown. It is in the vegetative stage and approximately 6 months old. Overall health is good, with stress indicators such as none_detected. The field is located in a rural area with flat terrain. Irrigation type is irrigation, and it's currently the growing_season."
}
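The `text_description` field appears to be a flattened summary of selected analysis fields, suitable for embedding. The sketch below shows how such a string could be assembled. The field mapping is inferred from the sample output above rather than taken from the project's code, and `build_text_description` is a hypothetical name.

```python
def build_text_description(analysis):
    """Flatten selected fields of a crop analysis dict into a searchable summary."""
    growth = analysis.get("growth_stage", {})
    health = analysis.get("health_assessment", {})
    env = analysis.get("environmental_context", {})
    cond = analysis.get("growing_conditions", {})

    colors = ", ".join(analysis.get("color", []))
    stress = ", ".join(health.get("stress_indicators", []))
    return (
        f"The image shows a {analysis.get('crop')} crop with colors {colors}. "
        f"It is in the {growth.get('stage')} stage and approximately "
        f"{growth.get('estimated_age_months')} months old. "
        f"Overall health is {health.get('overall_health')}, "
        f"with stress indicators such as {stress}. "
        f"The field is located in a {env.get('setting')} area "
        f"with {env.get('terrain')} terrain. "
        f"Irrigation type is {cond.get('irrigation_evidence')}, "
        f"and it's currently the {cond.get('season_indication')}."
    )
```

Keeping the summary as plain sentences means the same text can be shown to users and passed to a text embedding model.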
Component | Primary Files | Testing Requirements | Expertise Needed |
---|---|---|---|
Image Analysis | main.py, image_utils.py | VLM model validation, JSON output parsing | Python, AI/ML, Computer Vision |
Search Systems | CropSemanticSearch.py, speech_input.py | Vector similarity testing, audio processing | Python, NLP, Speech Processing |
Database Integration | clickhouse_client.py, schema files | Database connectivity, embedding storage | Python, ClickHouse, Vector Databases |
AI Model Integration | ollama_client.py, config.py | Model inference testing, prompt validation | Python, LLM Integration, API Design |
Configuration | .env, assets/prompts/ | Environment setup, template parsing | DevOps, Configuration Management |
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
Kindly refer to CONTRIBUTING.md for important Pull Request Process details
1. In the top-right corner of this page, click Fork.
2. Clone a copy of your fork to your local machine, replacing YOUR-USERNAME with your GitHub username:
   `git clone https://github.com/YOUR-USERNAME/LLM-Vision-Capabilities.git`
3. Create a branch:
   `git checkout -b <my-new-feature-or-fix>`
4. Make the necessary changes and commit them:
   `git add .`
   `git commit -m "new feature or fix"`
5. Push your changes, replacing `<add-your-branch-name>` with the name of the branch you created in step 3:
   `git push origin <add-your-branch-name>`
6. Submit your changes for review. Go to your repository on GitHub, where you'll see a Compare & pull request button. Click it, then submit the pull request.
That's it! Soon I'll be merging your changes into the master branch of this project. You will get a notification email once the changes have been merged. Thank you for your contribution.
Kindly follow Conventional Commits to create an explicit commit history. Prefix each commit message with one of the following types:
build : Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
ci : Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
docs : Documentation only changes
feat : A new feature
fix : A bug fix
perf : A code change that improves performance
refactor: A code change that neither fixes a bug nor adds a feature
style : Changes that do not affect the meaning of the code (white-space, formatting, missing semicolons, etc.)
test : Adding missing tests or correcting existing tests
This project uses GitHub's integrated issue tracking system to record bugs and feature requests. If you want to raise an issue, please follow the recommendations below:
- Before you log a bug, please search the issue tracker to see if someone has already reported the problem.
- If the issue doesn't already exist, create a new issue.
- Please provide as much information as possible with the issue report.
- If you need to paste code or include a stack trace, use Markdown ``` escapes before and after your text.
Distributed under the MIT License. See LICENSE.md for more information.
- GitHub Issues: Primary channel for bug reports and feature requests
- Pull Request Discussions: Technical discussions during code review
- Email Contact: For code of conduct violations or sensitive issues: Anantha Raju C - @anantharajuc - [email protected]