VideoFinder is a video analysis tool that uses a multimodal AI model to detect and locate specific objects or people in videos. It is built with FastAPI, runs the Llama 3.2 Vision model through Ollama, and provides a web interface for uploading videos and viewing results.
- Upload and analyze videos through an intuitive web interface
- Real-time frame-by-frame analysis using multimodal AI
- Natural language object description support
- Visual results display with confidence scores
- Image preprocessing for better detection accuracy
- Streaming response for real-time analysis feedback
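
The frame-by-frame analysis listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `sample_indices` and `iter_sampled_frames` and the one-frame-per-second default are assumptions made for the example. It picks evenly spaced frame indices and reads only those frames with OpenCV.

```python
from typing import Iterator, List, Tuple


def sample_indices(total_frames: int, fps: float, every_seconds: float = 1.0) -> List[int]:
    """Return the frame indices to analyze, one every `every_seconds` seconds.

    Hypothetical helper: the sampling interval is an assumption, not a
    documented VideoFinder setting.
    """
    if total_frames <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    # Never step by less than one frame, even for very short intervals.
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def iter_sampled_frames(video_path: str, every_seconds: float = 1.0) -> Iterator[Tuple[int, float, object]]:
    """Yield (frame_index, timestamp_seconds, frame) for each sampled frame."""
    import cv2  # imported lazily so the helper above works without OpenCV

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"Could not open video: {video_path}")
    try:
        fps = cap.get(cv2.CAP_PROP_FPS)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        for idx in sample_indices(total, fps, every_seconds):
            # Seek to the sampled frame and decode it.
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
            ok, frame = cap.read()
            if not ok:
                break
            yield idx, idx / fps, frame
    finally:
        cap.release()
```

Because `iter_sampled_frames` is a generator, each frame can be analyzed and reported as soon as it is read, which suits a streaming response.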
- Python 3.8+
- Ollama with the Llama 3.2 Vision model (llama3.2-vision) installed
- OpenCV
- Clone the repository
git clone https://github.com/win4r/VideoFinder-Llama3.2-vision-Ollama.git
cd VideoFinder-Llama3.2-vision-Ollama
- Install dependencies
pip install -r requirements.txt
- Make sure Ollama is running with the Llama 3.2 Vision model
ollama run llama3.2-vision
- Start the application
python main.py
- Access the web interface at
http://localhost:8000
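
If the application cannot reach the model, a quick check like the one below may help. It is a hedged sketch, not part of VideoFinder: it assumes Ollama is listening on its default address (`http://localhost:11434`) and queries the `/api/tags` endpoint, which lists locally available models. The helper names `has_model` and `ollama_ready` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address
MODEL = "llama3.2-vision"


def has_model(tags: Dict[str, Any], model: str = MODEL) -> bool:
    """Return True if `model` (with any tag, e.g. ':latest') appears in an /api/tags reply."""
    names = [m.get("name", "") for m in tags.get("models", [])]
    return any(n == model or n.startswith(model + ":") for n in names)


def ollama_ready(base_url: str = OLLAMA_URL, model: str = MODEL, timeout: float = 3.0) -> bool:
    """Return True if the Ollama server answers and reports the model as available."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return has_model(json.load(resp), model)
    except (OSError, ValueError):
        # Server not reachable, or the reply was not valid JSON.
        return False


if __name__ == "__main__":
    print("Ollama ready:", ollama_ready())
```

Note that this only confirms the model has been pulled and the server responds; it does not verify that the model is currently loaded in memory.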
- Open the web interface
- Upload a video file
- Enter a description of the object/person you want to find
- Click "Start Analysis"
- View results as they appear in real-time
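
To illustrate how a natural language description could be paired with a single frame, here is a rough sketch that calls Ollama's `/api/generate` endpoint directly. The prompt wording, the helper names `build_payload` and `ask_ollama`, and the default address are assumptions for illustration; VideoFinder's own prompt and request handling may differ.

```python
import base64
import json
import urllib.request


def build_payload(description: str, jpeg_bytes: bytes, model: str = "llama3.2-vision") -> dict:
    """Build a request body for Ollama's /api/generate with one base64-encoded image."""
    # Hypothetical prompt; not the wording used by VideoFinder.
    prompt = (
        f"Does this image contain: {description}? "
        "Answer yes or no, then give a confidence from 0 to 100."
    )
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(jpeg_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of a token stream
    }


def ask_ollama(payload: dict, base_url: str = "http://localhost:11434", timeout: float = 120.0) -> str:
    """POST the payload to Ollama and return the model's text answer."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

A frame read with OpenCV would first need to be encoded to JPEG bytes (for example with `cv2.imencode`) before being passed to `build_payload`.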
- FastAPI
- OpenCV
- Ollama
- Jinja2
- uvicorn