
🌾📸🎙🧠 Voice-Enabled Semantic Crop Intelligence

Search, identify, and explore crops using images, voice, and natural language.

Banner GIF

An intelligent, multi-modal crop analysis and search system built using advanced Vision-Language Models (VLMs) and semantic search techniques.

This tool can detect crops from images, analyze their characteristics, and search similar crop types using natural language or voice.

🧪 Example Use Cases

  • Upload photo → Get crop analysis and description
  • “Show me crops with red flowers and green stems” → Voice or text → View matching images

🚀 Features

🌿 Crop Detection & Analysis

  • Automatically identifies crops from images using Qwen 2.5 Vision, a powerful multimodal large language model.

🔍 Semantic Search

  • Search for similar crops using:
    • Text-based queries (e.g., “green leafy crop with wide leaves”)
    • Powered by CLIP-like embeddings and cosine similarity search using ClickHouse
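A minimal sketch of the similarity mechanism described above, assuming embeddings are plain float vectors. The `crop_embeddings` table and column names are illustrative, not the project's actual schema; `cosineDistance` is ClickHouse's built-in distance function (1 − cosine similarity, so ascending order returns the closest matches first).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative ClickHouse query using server-side query parameters:
# cosineDistance = 1 - cosine similarity, so ORDER BY ... ASC = most similar first.
TOP_K_QUERY = """
SELECT image_path, crop,
       cosineDistance(embedding, {query_vec:Array(Float32)}) AS dist
FROM crop_embeddings
ORDER BY dist ASC
LIMIT 5
"""
```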

🎙️ Voice-Based Querying

  • Allows users to speak their search queries naturally instead of typing
  • Voice is recorded using sounddevice and saved via scipy.io.wavfile
  • Audio is transcribed using OpenAI's Whisper via the whisper Python package
  • Transcribed text is passed to the semantic search engine for matching crop images
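The record → transcribe → search handoff above can be sketched as follows. This is an illustrative sketch, assuming `sounddevice`, `scipy`, and the `whisper` package are installed; the function names, the 16 kHz sample rate, and the `clean_query` normalization step are my additions, not the project's code.

```python
SAMPLE_RATE = 16_000  # Whisper models expect 16 kHz mono audio

def record_query(seconds: float, wav_path: str = "query.wav") -> str:
    """Record from the default microphone and save a WAV file."""
    import sounddevice as sd          # assumed dependency; imported lazily
    from scipy.io import wavfile
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="int16")
    sd.wait()                         # block until recording finishes
    wavfile.write(wav_path, SAMPLE_RATE, audio)
    return wav_path

def transcribe(wav_path: str, model_name: str = "base") -> str:
    """Transcribe the recording with a local Whisper model."""
    import whisper                    # assumed dependency; imported lazily
    model = whisper.load_model(model_name)
    return clean_query(model.transcribe(wav_path)["text"])

def clean_query(text: str) -> str:
    """Normalize a transcription before passing it to semantic search."""
    return " ".join(text.split()).strip().rstrip(".")
```

The cleaned string can then be embedded and matched against stored crop descriptions exactly like a typed query.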

🧠 Technology Stack

| Component | Technology |
| --- | --- |
| Vision-Language Model | Qwen 2.5 Vision |
| Local Model Serving | Ollama (self-hosted model runner) |
| Speech-to-Text (STT) | Whisper |
| Embeddings | CLIP-style vectors |
| Database | ClickHouse (vector search + metadata) |
| Storage | Local filesystem (image storage) |
| Backend Logic | Python |


Built with ❤︎ by Anantha Raju C and contributors

Explore the docs »

Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Model Recommendation
  3. Sample Output
  4. Contributing
  5. License
  6. Contact

LLM-Vision-Capabilities

The system is designed to:

  • Analyze crop images for health, growth stage, field characteristics, and environmental conditions
  • Generate comprehensive text descriptions for semantic search
  • Create embeddings for similarity search using both text and image features
  • Store everything in ClickHouse for efficient querying
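The storage step above could look something like the following sketch. The table name, columns, and 512-dimension assumption are illustrative, not the project's actual schema; `client` is assumed to be e.g. a `clickhouse_connect` client.

```python
EMBEDDING_DIM = 512  # CLIP-style vectors are commonly 512- or 768-dimensional

CREATE_TABLE_DDL = f"""
CREATE TABLE IF NOT EXISTS crop_analysis (
    image_path   String,
    crop         String,
    description  String,          -- text used for semantic search
    embedding    Array(Float32),  -- {EMBEDDING_DIM}-d CLIP-style vector
    analyzed_at  DateTime DEFAULT now()
)
ENGINE = MergeTree
ORDER BY (crop, image_path)
"""

def insert_analysis(client, row: dict) -> None:
    """Insert one analyzed image; `client` is e.g. a clickhouse_connect client."""
    client.insert(
        "crop_analysis",
        [[row["image_path"], row["crop"], row["description"], row["embedding"]]],
        column_names=["image_path", "crop", "description", "embedding"],
    )
```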

This Python script lets you identify crops in an image by running vision-enabled LLMs, such as llama3.2-vision or qwen2.5vl, locally via an Ollama server, without relying on the Hugging Face Transformers library or cloud-based APIs.

It sends an image and a predefined JSON-format prompt to a selected vision model running locally via Ollama, and returns structured information about the crop detected in the image.
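The request described above can be sketched against Ollama's `/api/generate` endpoint, which accepts base64-encoded images and a `format: "json"` hint. The function names and the default URL are illustrative, assuming a local Ollama server on its default port.

```python
import base64
import json

def build_ollama_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,                 # e.g. "qwen2.5vl:latest"
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",               # ask Ollama to constrain output to JSON
        "stream": False,
    }

def detect_crop(model: str, prompt: str, image_path: str,
                url: str = "http://localhost:11434/api/generate") -> dict:
    """Send the image + prompt to a local Ollama server and parse the reply."""
    import requests                     # assumed dependency; imported lazily
    with open(image_path, "rb") as f:
        body = build_ollama_request(model, prompt, f.read())
    resp = requests.post(url, json=body, timeout=600)
    resp.raise_for_status()
    return json.loads(resp.json()["response"])  # the model's JSON answer
```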

By default, it uses a basic prompt, but more detailed prompts (e.g., for disease detection or richer output) can be saved as .txt files inside the assets/ directory, so you can define multiple prompt types.

These prompts are loaded dynamically and sent to the model, allowing customization without modifying code.

Example JSON Prompt Template

Identify the crop in this image and respond ONLY in the following JSON format:

{
  "crop": "<primary crop name>",
  "alternate_names": ["<alternate name 1>", "<alternate name 2>"],
  "color": ["<color 1>", "<color 2>"],
  "confidence": <confidence score from 0 to 1>
}

If any field is not known, return an empty list or null value as appropriate. Do not include any other text.
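Even with a strict prompt like the one above, vision models sometimes wrap their reply in markdown fences or extra prose, so the response usually needs defensive parsing. A hedged sketch (function and constant names are mine, not the project's):

```python
import json

REQUIRED_KEYS = {"crop", "alternate_names", "color", "confidence"}

def parse_crop_response(raw: str) -> dict:
    """Parse the model's reply into a dict matching the JSON template above.

    Extracts the outermost {...} block first, in case the model wrapped
    its answer in markdown fences or surrounding text.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model response")
    data = json.loads(raw[start:end + 1])
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"response missing fields: {sorted(missing)}")
    return data
```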

(back to top)

Model Recommendation

While the script has been briefly tested with qwen2.5vl:latest and llama3.2-vision:latest, qwen2.5vl:latest is recommended based on local testing due to:

  • Reasonable inference times
  • Reliable structured JSON responses
  • Decent resource usage on a typical commodity laptop

⚠️ Note: These observations are based on running the models locally on a standard laptop. Performance and accuracy may vary depending on your system's hardware (CPU, GPU, RAM, etc.).

(back to top)


Features

  • Uses models like llama3.2-vision and qwen2.5vl via the Ollama API
  • Accepts a local image and outputs structured JSON including:
    • Crop name
    • Alternate crop names
    • Color details
    • Confidence score
    • Metadata like inference time

(back to top)

Demo Image

Demo Image

(back to top)

Output

The result is a structured JSON response, like:

Crop Detection

{
  "crop": "Sugarcane",
  "alternate_names": [
    "Sugar cane",
    "Cane"
  ],
  "color": [
    "Green",
    "Brown"
  ],
  "confidence": 0.95,
  "metadata": {
    "startDateTime": "2025-06-07T20:58:35.196729",
    "endDateTime": "2025-06-07T21:00:36.916434",
    "duration": 121.72
  }
}

Crop Analysis

{
  "crop": "Sugarcane",
  "alternate_names": [
    "Sugar cane",
    "Saccharum officinarum"
  ],
  "color": [
    "green",
    "brown"
  ],
  "confidence": 0.95,
  "overall_description": "The image shows a field of sugarcane with tall, green stalks growing in rows. The field appears to be in a vegetative growth stage, with no visible signs of flowering or fruiting. The soil is visible and appears to be well-tended, indicating a managed agricultural setting.",
  "growth_stage": {
    "stage": "vegetative",
    "estimated_age_months": 6,
    "description": "The sugarcane plants are tall and have a uniform height, indicating they are in the vegetative stage of growth. The presence of young leaves suggests they are not yet mature enough to flower or bear fruit."
  },
  "health_assessment": {
    "overall_health": "good",
    "vigor_score": 0.85,
    "disease_indicators": [
      "empty list"
    ],
    "pest_indicators": [
      "empty list"
    ],
    "stress_indicators": [
      "none_detected"
    ],
    "health_description": "The sugarcane plants appear healthy with no visible signs of disease or pest damage. The leaves are green and there are no signs of yellowing or wilting, indicating good vigor and health."
  },
  "field_characteristics": {
    "planting_pattern": "rows",
    "plant_density": "medium",
    "field_size_estimate": "medium_field",
    "crop_uniformity": "uniform",
    "weed_presence": "none",
    "field_description": "The sugarcane is planted in neat rows, with a consistent spacing between plants. The field appears to be well-maintained, with no visible weeds or other vegetation competing for resources."
  },
  "environmental_context": {
    "setting": "rural",
    "terrain": "flat",
    "surrounding_vegetation": "trees",
    "infrastructure_visible": [
      "irrigation"
    ],
    "weather_conditions": "clear",
    "environment_description": "The field is located in a rural area with a flat terrain and surrounded by trees. There is evidence of irrigation infrastructure, suggesting the field is well-supplied with water. The weather appears clear, indicating favorable growing conditions."
  },
  "growing_conditions": {
    "moisture_level": "adequate",
    "soil_visibility": "clearly_visible",
    "irrigation_evidence": "irrigation",
    "season_indication": "growing_season",
    "conditions_description": "The soil is clearly visible and appears to be well-moistened, indicating adequate irrigation. The growing conditions suggest it is the growing season, with no signs of drought or waterlogging."
  },
  "agricultural_insights": {
    "farming_type": "commercial",
    "management_quality": "good",
    "harvest_readiness": "not_ready",
    "estimated_months_to_harvest": null,
    "management_description": "The sugarcane field is managed with a focus on irrigation, as evidenced by the visible infrastructure. The uniform planting and healthy appearance suggest a good level of management. The field is not yet ready for harvest, as the plants are still in the vegetative stage."
  },
  "recommendations": [
    "Continue with current irrigation practices to ensure adequate moisture levels.",
    "Monitor the field for any signs of pests or diseases and take preventive measures if necessary.",
    "Prepare the field for harvest when the sugarcane reaches the mature stage."
  ],
  "recommendations_summary": "The sugarcane field is in good health and well-managed, with adequate irrigation and uniform planting. The field is not yet ready for harvest, and continued monitoring and irrigation practices are recommended to ensure optimal growth and yield.",
  "image_metadata": {
    "image_quality": "good",
    "lighting_conditions": "natural_daylight",
    "viewing_angle": "ground_level",
    "coverage_area": "field_overview",
    "visual_description": "The image provides a clear overview of the sugarcane field, showing the rows of plants and the surrounding environment."
  },
  "semantic_tags": [
    "sugarcane",
    "vegetative_stage",
    "agricultural_management",
    "irrigation",
    "rural_setting"
  ],
  "search_context": "Sugarcane field in vegetative stage, good health, irrigation managed, rural setting, clear weather",
  "metadata": {
    "startDateTime": "2025-06-08T20:44:27.957857",
    "endDateTime": "2025-06-08T20:50:47.604600",
    "duration": 379.65
  },
  "text_description": "The image shows a Sugarcane crop with colors green, brown. It is in the vegetative stage and approximately 6 months old. Overall health is good, with stress indicators such as none_detected. The field is located in a rural area with flat terrain. Irrigation type is irrigation, and it's currently the growing_season."
}
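The `text_description` field at the end of the analysis reads like a template filled from the structured fields. A hypothetical builder for the first part of that sentence, assuming the field names shown in the sample output above:

```python
def build_text_description(analysis: dict) -> str:
    """Flatten key analysis fields into a sentence suitable for embedding
    and semantic search. The template is illustrative, not the project's."""
    colors = ", ".join(analysis.get("color", []))
    stage = analysis.get("growth_stage", {})
    return (
        f"The image shows a {analysis['crop']} crop with colors {colors}. "
        f"It is in the {stage.get('stage', 'unknown')} stage and approximately "
        f"{stage.get('estimated_age_months', 'unknown')} months old."
    )
```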

(back to top)

Contributing

Contribution Areas

| Component | Primary Files | Testing Requirements | Expertise Needed |
| --- | --- | --- | --- |
| Image Analysis | main.py, image_utils.py | VLM model validation, JSON output parsing | Python, AI/ML, Computer Vision |
| Search Systems | CropSemanticSearch.py, speech_input.py | Vector similarity testing, audio processing | Python, NLP, Speech Processing |
| Database Integration | clickhouse_client.py, schema files | Database connectivity, embedding storage | Python, ClickHouse, Vector Databases |
| AI Model Integration | ollama_client.py, config.py | Model inference testing, prompt validation | Python, LLM Integration, API Design |
| Configuration | .env, assets/prompts/ | Environment setup, template parsing | DevOps, Configuration Management |

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Kindly refer to CONTRIBUTING.md for important details on the pull request process.

  1. In the top-right corner of this page, click Fork.

  2. Clone your fork to your local machine, replacing YOUR-USERNAME with your GitHub username:

    git clone https://github.com/YOUR-USERNAME/LLM-Vision-Capabilities.git

  3. Create a branch:

    git checkout -b <my-new-feature-or-fix>

  4. Make necessary changes and commit those changes:

    git add .

    git commit -m "new feature or fix"

  5. Push your changes, replacing <add-your-branch-name> with the name of the branch you created in step 3:

    git push origin <add-your-branch-name>

  6. Submit your changes for review. On your repository page on GitHub, click the Compare & pull request button, then submit the pull request.

That's it! Soon I'll be merging your changes into the master branch of this project. You will get a notification email once the changes have been merged. Thank you for your contribution.

Kindly follow Conventional Commits to create an explicit commit history, and prefix each commit message with one of the following types:

build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
ci: Changes to CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
docs: Documentation-only changes
feat: A new feature
fix: A bug fix
perf: A code change that improves performance
refactor: A code change that neither fixes a bug nor adds a feature
style: Changes that do not affect the meaning of the code (white-space, formatting, missing semicolons, etc.)
test: Adding missing tests or correcting existing tests

(back to top)

Reporting Issues/Suggest Improvements

This project uses GitHub's integrated issue tracking system to record bugs and feature requests. If you want to raise an issue, please follow the recommendations below:

  • Before you log a bug, please search the issue tracker to see if someone has already reported the problem.
  • If the issue doesn't already exist, create a new issue
  • Please provide as much information as possible with the issue report.
  • If you need to paste code or include a stack trace, wrap it in Markdown code fences (```) before and after your text.

(back to top)

License

Distributed under the MIT License. See LICENSE.md for more information.

(back to top)

Contact Channels

  • GitHub Issues: Primary channel for bug reports and feature requests
  • Pull Request Discussions: Technical discussions during code review
  • Email Contact: For code of conduct violations or sensitive issues: Anantha Raju C - @anantharajuc - [email protected]

(back to top)

Star History

Star History Chart

(back to top)
