feat(mobile): Implement adaptive parallel upload pipeline with intelligent throttling #24934

Closed
kellyGrillo wants to merge 3 commits into immich-app:main from kellyGrillo:main

kellyGrillo commented Dec 31, 2025

Description

This PR implements a major overhaul of the mobile backup system with an adaptive parallel upload pipeline that provides faster, more reliable, and user-controllable uploads with real-time feedback.

Key Features

🚀 Parallel Upload Pipeline

  • Concurrent hashing and uploading (no more waiting for all files to hash first)
  • Immediate uploads as batches complete
  • Configurable batch sizes (default: 20 files)

⚡ Adaptive Throttling

  • Dynamic adjustment based on success/failure rates
  • Recovery mode on errors, acceleration when stable
  • Presets: Aggressive, Balanced, Conservative

🌐 Network-Aware Large File Handling

  • Auto-detects local vs external (Cloudflare) networks
  • Skips large files (>50MB) on external networks
  • Auto-uploads large files when back on local network

📱 New Adaptive Upload UI Card

  • Real-time stats: batch size, delay, speed (MB/s)
  • Interactive sliders for manual tuning
  • Visual status indicators

🐛 Bug Fixes

  • Fixed negative remainder count
  • Fixed queue flooding
  • Failed uploads no longer block healthy files
  • Proper duplicate handling (HTTP 200/409)

Testing

  • Tested with 8000+ photos including large MP4 files
  • Verified on local network with familyvault.local
  • Cloud photo handling (Samsung/iCloud) works but is inherently slow

Screenshots

image

Breaking Changes

None - all changes are additive and backward compatible.

Kelly Grillo added 3 commits December 26, 2025 13:19
Implement a self-tuning backup system that automatically adjusts batch sizes
and timing based on real-time performance metrics. This prevents the app from
bogging down when backing up large photo libraries (10,000+ assets).

## Problem
When users with thousands of photos/videos initiate backup, the app would
load all assets into memory at once, causing performance issues, freezing,
and crashes around 3,000 files.

## Solution: Adaptive 'Goldilocks' Algorithm
The system continuously monitors upload performance and automatically adjusts:
- **Batch size**: 10-200 assets per batch (starts conservative: 20-50)
- **Delay between batches**: 0-5000ms (allows memory to settle)

### Algorithm Logic
- Too hot (errors/slowdowns): Reduce batch size, increase delay
- Too cold (fast/all success): Increase batch size, reduce delay
- Just right (stable): Maintain current settings
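
The three rules above can be sketched as a single adjustment step. This is an illustrative Python sketch, not the PR's Dart code: the bounds (10-200 assets, 0-5000ms) come from the commit message, while `ThrottleSettings`, `adjust`, the 10% failure threshold, and the step sizes are assumptions. Only errors are modelled here, not slowdowns.

```python
from dataclasses import dataclass

# Bounds taken from the commit message; thresholds and step sizes are assumptions.
MIN_BATCH, MAX_BATCH = 10, 200
MIN_DELAY_MS, MAX_DELAY_MS = 0, 5000


@dataclass(frozen=True)
class ThrottleSettings:
    batch_size: int
    delay_ms: int


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(high, value))


def adjust(settings: ThrottleSettings, failures: int, total: int) -> ThrottleSettings:
    """Return the settings for the next batch from the last batch's outcome."""
    failure_rate = failures / total if total else 0.0
    if failure_rate > 0.10:
        # Too hot: halve the batch and back off.
        batch = settings.batch_size // 2
        delay = settings.delay_ms * 2 + 100
    elif failure_rate == 0.0:
        # Too cold: everything succeeded, so push harder.
        batch = settings.batch_size + 10
        delay = settings.delay_ms // 2
    else:
        # Just right: keep the current settings.
        return settings
    return ThrottleSettings(
        clamp(batch, MIN_BATCH, MAX_BATCH),
        clamp(delay, MIN_DELAY_MS, MAX_DELAY_MS),
    )
```

Returning a new immutable value on each step keeps the adjustment easy to test in isolation from the upload loop.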

## New Files
- models/backup/backup_metrics.model.dart - Per-batch performance tracking
- models/backup/backup_checkpoint.model.dart - Resume capability after interruption
- models/backup/adaptive_state.model.dart - Throttle system state management
- services/adaptive_throttle.service.dart - Core Goldilocks algorithm
- services/backup_recovery.service.dart - Multi-level recovery manager

## Multi-Level Recovery System
- Level 1 (Soft): Clear caches, 5s pause, continue automatically
- Level 2 (Hard): Save checkpoint, clear all caches, 30s cooldown
- Level 3 (Restart): Save state, restart app, auto-resume (Android only)
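
One way to read the three levels is as an escalation ladder. The sketch below is illustrative Python, not the PR's implementation: the level descriptions and pause lengths come from the list above, but the commit message does not say what triggers each level, so escalating one level per failed recovery attempt is an assumption, as are the names `RecoveryLevel` and `next_level`.

```python
from enum import IntEnum


class RecoveryLevel(IntEnum):
    SOFT = 1     # clear caches, 5 s pause, continue automatically
    HARD = 2     # save checkpoint, clear all caches, 30 s cooldown
    RESTART = 3  # save state, restart app, auto-resume (Android only)


# Pause lengths for levels 1 and 2 come from the commit message.
COOLDOWN_SECONDS = {RecoveryLevel.SOFT: 5, RecoveryLevel.HARD: 30}


def next_level(failed_attempts: int, is_android: bool) -> RecoveryLevel:
    """Pick a recovery level from how many recovery attempts already failed.

    Assumption: escalate one level per failed attempt, capped at the
    highest level the platform supports (Level 3 is Android only).
    """
    highest = RecoveryLevel.RESTART if is_android else RecoveryLevel.HARD
    return RecoveryLevel(min(failed_attempts + 1, int(highest)))
```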

## UI Enhancements
- Batch progress bar with status badges (Optimizing, Stable, Speeding up, etc.)
- Adaptive status messages showing when adjustments happen
- Hidden advanced settings in troubleshooting mode for manual override

## Technical Changes
- BackupService: Added backupAssetAdaptive() with batched processing
- BackupProvider: Integrated throttle controller and recovery service
- BackupState: Added adaptiveState, currentBatchNumber, totalBatches
- BackgroundServicePlugin.kt: Added restartApp() for Android Level 3 recovery

## User Experience
- Completely automatic - no configuration needed for average users
- Informative progress display showing batch progress
- Advanced users can access manual controls in troubleshooting settings

Fixes performance issues when backing up large photo libraries
…igent throttling

See commit body for full details
…igent throttling

This is a major overhaul of the mobile backup system to provide faster, more
reliable, and user-controllable uploads with real-time feedback.

## Key Features

### 1. Parallel Upload Pipeline
- Replaced sequential "hash all then upload all" with concurrent hashing and uploading
- Assets are uploaded immediately as batches complete hashing
- Hash service now emits events after each batch via HashBatchResult callbacks
- Pipeline processes batches of configurable size (default: 20 files)
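
The difference from "hash all then upload all" can be sketched with a producer and a consumer joined by a queue. This is a minimal Python illustration under stated assumptions, not the PR's Dart code: `hash_in_batches`, `run_pipeline`, and the `on_batch` callback are hypothetical stand-ins for the hash service and its HashBatchResult callbacks, SHA-1 is only a placeholder digest, and the "upload" just records asset names.

```python
import hashlib
import queue
import threading
from typing import Callable, Iterable, List, Tuple

HashedBatch = List[Tuple[str, str]]  # (asset name, hex digest)


def hash_in_batches(
    assets: Iterable[Tuple[str, bytes]],
    batch_size: int,
    on_batch: Callable[[HashedBatch], None],
) -> None:
    """Hash assets and invoke on_batch as soon as each batch is complete."""
    batch: HashedBatch = []
    for name, data in assets:
        batch.append((name, hashlib.sha1(data).hexdigest()))
        if len(batch) == batch_size:
            on_batch(batch)
            batch = []
    if batch:
        on_batch(batch)  # flush the final partial batch


def run_pipeline(assets, batch_size=20):
    """Hash on a worker thread while this thread 'uploads' finished batches."""
    ready = queue.Queue()

    def hasher_job():
        hash_in_batches(assets, batch_size, ready.put)
        ready.put(None)  # sentinel: hashing is finished

    hasher = threading.Thread(target=hasher_job)
    hasher.start()
    uploaded = []
    while (batch := ready.get()) is not None:
        uploaded.append([name for name, _ in batch])  # stand-in for the upload
    hasher.join()
    return uploaded
```

With 45 assets and the default batch size of 20, the consumer receives batches of 20, 20, and 5, and can start on the first one while the rest are still being hashed.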

### 2. Adaptive Throttling System
- Dynamic batch size adjustment based on upload success/failure rates
- Automatic recovery mode when errors detected (reduces batch size)
- Acceleration mode when uploads are stable (increases throughput)
- Configurable delay between batches (100ms - 5000ms)
- Presets: Aggressive (50 batch, 100ms), Balanced (20, 500ms), Conservative (5, 2000ms)
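
The presets and the delay range above map naturally onto a small lookup table. The Python sketch below is illustrative only: the preset values and the 100ms-5000ms range are taken from the list, while `Preset`, `PRESETS`, and `resolve` are hypothetical names, and clamping a manual delay override is an assumption about how the sliders interact with presets.

```python
from typing import NamedTuple, Optional


class Preset(NamedTuple):
    batch_size: int
    delay_ms: int


# Values as listed in the commit message.
PRESETS = {
    "aggressive": Preset(50, 100),
    "balanced": Preset(20, 500),
    "conservative": Preset(5, 2000),
}

MIN_DELAY_MS, MAX_DELAY_MS = 100, 5000


def resolve(name: str, delay_override_ms: Optional[int] = None) -> Preset:
    """Look up a preset, optionally replacing its delay with a clamped override."""
    preset = PRESETS[name.lower()]
    if delay_override_ms is None:
        return preset
    clamped = max(MIN_DELAY_MS, min(MAX_DELAY_MS, delay_override_ms))
    return preset._replace(delay_ms=clamped)
```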

### 3. Network-Aware Large File Handling
- Detects local vs external network (Cloudflare) connections
- Automatically skips large files (>50MB) on external networks
- Queues large files for automatic upload when local network detected
- Dynamic retry logic: small files get 3 retries, large files get 5-8 retries
- Prevents Cloudflare 100MB upload limit failures
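
The skip and retry rules above reduce to two small decisions per file. This Python sketch is an illustration, not the PR's code: the 50MB threshold and the 3 versus 5-8 retry counts come from the list, but treating "MB" as binary megabytes, scaling retries by one per 100 MiB, and the names `should_defer` and `max_retries` are assumptions. How the app detects a local network is not modelled.

```python
MIB = 1024 * 1024
LARGE_FILE_BYTES = 50 * MIB  # ">50MB"; binary megabytes are an assumption


def should_defer(size_bytes: int, on_local_network: bool) -> bool:
    """Defer large files while on an external network; upload everything locally."""
    return size_bytes > LARGE_FILE_BYTES and not on_local_network


def max_retries(size_bytes: int) -> int:
    """3 retries for small files; 5 to 8 for large ones.

    Assumption: large files get one extra retry per 100 MiB, capped at 8.
    """
    if size_bytes <= LARGE_FILE_BYTES:
        return 3
    extra = size_bytes // (100 * MIB)
    return min(8, 5 + extra)
```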

### 4. Cloud Photo Optimization
- Prioritizes locally-available files over cloud-backed (Samsung/iCloud) photos
- Cloud files require device download before upload - now properly deferred
- Added isLocal tracking to identify cloud-backed assets
- Option to defer cloud files entirely with isLocalOnly setting
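
Prioritizing local files can be as simple as a stable sort on the availability flag. The sketch below is illustrative Python under stated assumptions: `is_local` and `local_only` mirror the isLocal and isLocalOnly names from the list, while `Asset` and `order_for_upload` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Asset:
    name: str
    is_local: bool  # False for cloud-backed (Samsung/iCloud) assets


def order_for_upload(assets: List[Asset], local_only: bool = False) -> List[Asset]:
    """Put locally available assets first, or drop cloud-backed ones entirely."""
    if local_only:
        return [a for a in assets if a.is_local]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(assets, key=lambda a: not a.is_local)
```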

### 5. New Adaptive Upload UI Card
- Real-time display of: batch size, delay, upload speed (MB/s)
- Status indicators: Active uploads, Queued, Completed, Failed, Cloud-deferred
- Visual status: ACTIVE (green), ADJUSTING (orange), RECOVERING (red), IDLE (gray)
- Interactive sliders for manual batch size and delay adjustment
- Preset buttons for quick configuration changes
- Clickable status block navigates to upload detail view

### 6. Improved Queue Management
- Fixed queue flooding - respects configured batch size limits
- Accurate tracking: enqueueCount clamped to actual concurrent limit
- Failed/stuck uploads no longer block healthy file uploads
- Auto-removal of failed items after 10 seconds to prevent UI clutter
- Stuck upload detection and automatic cancellation
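
Two of the rules above, clamping the enqueue count and expiring failed items, can be sketched as pure functions. This is hypothetical Python, not the PR's code: the 10-second expiry comes from the list, while `enqueue_count`, `prune_failed`, and their parameters are assumptions about how the limits combine.

```python
FAILED_TTL_SECONDS = 10  # from the commit message


def enqueue_count(batch_size: int, max_concurrent: int, in_flight: int, pending: int) -> int:
    """How many tasks to enqueue now without exceeding any limit."""
    free_slots = max(0, max_concurrent - in_flight)
    return min(batch_size, free_slots, pending)


def prune_failed(failed_at: dict, now: float) -> dict:
    """Keep only failed tasks that failed less than FAILED_TTL_SECONDS ago.

    failed_at maps a task id to the timestamp (in seconds) of its failure.
    """
    return {task: t for task, t in failed_at.items() if now - t < FAILED_TTL_SECONDS}
```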

### 7. Enhanced Error Handling
- Graceful database failure handling with consecutive error tracking
- Pipeline stops after 10 consecutive DB errors with informative message
- Proper handling of HTTP 200 (duplicate), 201 (created), 409 (conflict)
- Fixed negative remainder count bug - duplicates no longer inflate counts
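
A sketch of the status handling and the consecutive-error cutoff, in Python for illustration only. The status codes and the limit of 10 come from the list above; treating 409 as a duplicate follows the PR description ("HTTP 200/409"), and `UploadOutcome`, `classify`, `remaining`, and `DbErrorTracker` are hypothetical names. Flooring the remainder at zero is an assumption about how the negative count was fixed.

```python
from enum import Enum

MAX_CONSECUTIVE_DB_ERRORS = 10  # from the commit message


class UploadOutcome(Enum):
    CREATED = "created"
    DUPLICATE = "duplicate"
    FAILED = "failed"


def classify(status_code: int) -> UploadOutcome:
    """Map an upload response status to an outcome."""
    if status_code == 201:
        return UploadOutcome.CREATED
    if status_code in (200, 409):
        return UploadOutcome.DUPLICATE
    return UploadOutcome.FAILED


def remaining(total: int, outcomes) -> int:
    """Assets still to upload; never negative, even if outcomes are over-reported."""
    done = sum(1 for o in outcomes if o is not UploadOutcome.FAILED)
    return max(0, total - done)


class DbErrorTracker:
    """Counts consecutive database failures; any success resets the count."""

    def __init__(self) -> None:
        self.consecutive = 0

    def record(self, ok: bool) -> bool:
        """Record one DB operation; return True if the pipeline should stop."""
        self.consecutive = 0 if ok else self.consecutive + 1
        return self.consecutive >= MAX_CONSECUTIVE_DB_ERRORS
```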

### 8. UI/UX Improvements
- Backup toggle: Static icon (green=on, gray=off) instead of spinning
- Removed redundant "View Details" link
- Upload speed display (KB/s or MB/s)
- Pipeline status indicators use Wrap to prevent overflow
- Chevron always clickable for upload details view
- Added "(DEV)" tag to login screen version for development builds
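
The KB/s or MB/s display can be sketched as a small formatter. This is a hypothetical Python helper, not the PR's widget code: `format_speed`, the 1024-based units, and the rounding are all assumptions.

```python
def format_speed(bytes_per_second: float) -> str:
    """Format a transfer rate as KB/s below 1 MB/s and as MB/s otherwise."""
    kb = bytes_per_second / 1024
    if kb < 1024:
        return f"{kb:.0f} KB/s"
    return f"{kb / 1024:.1f} MB/s"
```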

### 9. Performance Optimizations
- Configured FileDownloader for 10 concurrent uploads (up from default 4)
- Reduced pipeline tick interval from 300ms to 100ms for responsiveness
- State updates only when values change to reduce UI rebuilds
- Faster cancel operation using bulk task cancellation

### 10. Backup Toggle as Master Switch
- Enable Backup toggle now governs ALL upload activity
- Adaptive uploads respect toggle state
- Auto-upload of large files checks toggle before proceeding

## Files Modified

### Core Logic
- providers/backup/drift_backup.provider.dart - Main pipeline orchestration
- services/upload.service.dart - Upload task building, network detection, batching
- services/adaptive_throttle.service.dart - Throttling algorithm adjustments
- domain/services/hash.service.dart - Batch callback support for parallel hashing
- domain/utils/background_sync.dart - Hash callback integration
- repositories/upload.repository.dart - FileDownloader concurrent config

### UI Components
- pages/backup/drift_backup.page.dart - New Adaptive Upload card
- presentation/widgets/backup/backup_toggle_button.widget.dart - Static toggle
- widgets/settings/backup_settings/backup_settings.dart - Settings visibility
- widgets/settings/backup_settings/drift_backup_settings.dart - Throttle settings
- widgets/backup/current_backup_asset_info_box.dart - Status enum handling

### Models
- models/backup/adaptive_state.model.dart - Added monitoring/idle states

### Infrastructure
- infrastructure/repositories/storage.repository.dart - Local availability check

## Testing Notes
- Tested with 8000+ photos including large MP4 files
- Verified local network detection with familyvault.local
- Confirmed Cloudflare bypass for large files on local network
- Cloud photo uploads work but are inherently slow (device must download first)

## Breaking Changes
None - all changes are additive and backward compatible.

Co-authored-by: Claude (AI Assistant)

immich-push-o-matic bot commented Dec 31, 2025

Label error. Requires exactly 1 of: changelog:.*. Found: 📱mobile. A maintainer will add the required label.

@alextran1502 (Member)

Thanks for the PR, but I will reject this one, as it is one of the most important parts of the app and we are actively working to improve it. Using AI entirely is not the way to go here.

@kellyGrillo (Author)

> Thanks for the PR, but I will reject this one, as it is one of the most important parts of the app and we are actively working to improve it. Using AI entirely is not the way to go here.

Thanks Alex, but it is working fantastically for me, and it is the only way I was able to bulk upload consistently. Can you explain why AI-only programming isn't the way to go? What am I missing?

@alextran1502 (Member)

For this specific case, there are many things we need to understand, such as iCloud downloads, hashing, cleanup, and uploads that run in the foreground and background on iOS and Android. Each scenario has its own quirks. So using AI for this flow will likely be error-prone, work for one case, and fail in others.

We are also actively working on other PRs related to hashing and uploading, so I would like to focus on the work that we understand and are confident in.

#24883
#20418

@kellyGrillo (Author)

> For this specific case, there are many things we need to understand, such as iCloud downloads, hashing, cleanup, and uploads that run in the foreground and background on iOS and Android. Each scenario has its own quirks. So using AI for this flow will likely be error-prone, work for one case, and fail in others.
>
> We are also actively working on other PRs related to hashing and uploading, so I would like to focus on the work that we understand and are confident in.
>
> #24883 #20418

Thanks Alex, appreciate the reply.
