21 changes: 15 additions & 6 deletions docs/plans/2026-03-05-image-editing-improvements-design.md
@@ -12,29 +12,34 @@ Immich has a non-destructive image editing system (crop, rotate, mirror) merged
### Phase 1 — Bug Fixes

#### 1.1 Person thumbnail scalar subquery (#26045)

- **File**: `server/src/repositories/person.repository.ts:287-293`
- **Problem**: Scalar subquery on `asset_file` returns multiple rows when an asset has both original and edited preview files, crashing `PersonGenerateThumbnail`.
- **Fix**: Add `.limit(1)` to the subquery with ordering to prefer non-edited files.
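
For illustration, the intended selection logic (prefer the non-edited preview and take exactly one row) can be sketched in plain TypeScript; the actual fix is an `orderBy` plus `.limit(1)` on the Kysely subquery, and the row shape below is hypothetical:

```typescript
// Hypothetical asset_file row shape; field names are illustrative, not Immich's schema.
type AssetFile = { assetId: string; type: 'preview' | 'thumbnail'; path: string; isEdited: boolean };

// Mirrors the intent of ORDER BY ... LIMIT 1: when an asset has both an
// original and an edited preview, deterministically pick the non-edited one.
function pickPreviewFile(files: AssetFile[]): AssetFile | undefined {
  return files
    .filter((f) => f.type === 'preview')
    .sort((a, b) => Number(a.isEdited) - Number(b.isEdited))[0];
}
```

Ordering non-edited files first makes the single-row pick deterministic even when both variants exist, which is exactly the failure mode behind the crash.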

#### 1.2 Download-as-album serves original (#26182)

- **File**: `server/src/repositories/download.repository.ts`
- **Problem**: `editedPath` is not selected in the download query builder, so the conditional logic in `download.service.ts:107` always falls back to `originalPath`.
- **Fix**: Add `editedPath` to the select clause in the download repository.

#### 1.3 Album cover not refreshed after edits (#25803)

- **Files**: `server/src/repositories/album.repository.ts:340-381`, asset file serving logic
- **Problem**: Album cover thumbnails don't refresh when the cover photo is edited; no logic to prefer the edited version.
- **Fix**: When serving album cover thumbnails, prefer the edited preview file when one exists.

### Phase 2 — Quick-Rotate in Viewer

#### Architecture

- **New component**: `web/src/lib/components/asset-viewer/actions/rotate-action.svelte`
- One rotate-right (90° CW) icon button in the viewer toolbar, positioned before the Edit button.
- Rotate-left (90° CCW) and rotate-180 added to the More dropdown menu.
- Only visible for owned images under the same conditions as Edit (no video, GIF, SVG, panorama, live photo).

#### Data Flow

1. User clicks rotate button
2. Client calls `getAssetEdits(id)` to read current edits
3. Append `{ action: 'rotate', angle: 90 }` to the edit list
@@ -44,6 +49,7 @@ Immich has a non-destructive image editing system (crop, rotate, mirror) merged
7. Show loading spinner on button during processing

#### UX Details

- Loading spinner replaces icon during processing
- Success: image refreshes in place, no toast needed
- Error: error toast displayed
@@ -52,18 +58,21 @@ Immich has a non-destructive image editing system (crop, rotate, mirror) merged
### Phase 3 — Batch Rotate in Timeline

#### Architecture

- **New component**: `web/src/lib/components/timeline/actions/rotate-action.svelte`
- Appears in the multi-select toolbar when images are selected (follows FavoriteAction/ArchiveAction pattern).
- Dropdown with rotate-left, rotate-right, rotate-180 options.

#### Implementation

- Client-side loop iterating selected assets.
- For each asset: read existing edits, append rotation, write back.
- Filters out non-image and non-owned assets before processing.
- Progress feedback via toast ("Rotating 15/200...").
- Continues on individual failures, reports summary at end.
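
A minimal sketch of that loop, assuming a per-asset `rotateOne` helper that wraps the read/append/write flow (both helper names are hypothetical):

```typescript
// Client-side batch rotate: continues past individual failures and returns a
// summary. The progress callback would drive the "Rotating 15/200..." toast.
async function batchRotate(
  assetIds: string[],
  rotateOne: (id: string) => Promise<void>,
  onProgress: (done: number, total: number) => void,
): Promise<{ succeeded: number; failed: number }> {
  let succeeded = 0;
  let failed = 0;
  for (const [i, id] of assetIds.entries()) {
    try {
      await rotateOne(id);
      succeeded++;
    } catch {
      failed++; // continue on individual failures; report in the summary
    }
    onProgress(i + 1, assetIds.length);
  }
  return { succeeded, failed };
}
```
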

### Out of Scope

- Server-side batch endpoint (future optimization for Phase 3)
- Color adjustments (brightness, contrast, warmth)
- Mobile parity (Flutter editor migration)
@@ -72,9 +81,9 @@ Immich has a non-destructive image editing system (crop, rotate, mirror) merged

## Key Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Quick-rotate architecture | Read-then-write pattern (client reads edits, appends, writes back) | Simple, no new server endpoints, correctly composes with existing edits |
| Toolbar placement | Single rotate-right in toolbar + both directions in More menu | Balance of discoverability and toolbar cleanliness |
| Batch implementation | Client-side loop | Matches existing batch action patterns, ships faster, optimize later if needed |
| Edit composition | Append to existing edits | Preserves user's crop/mirror edits when rotating |
| Decision | Choice | Rationale |
| ------------------------- | ------------------------------------------------------------------ | ------------------------------------------------------------------------------ |
| Quick-rotate architecture | Read-then-write pattern (client reads edits, appends, writes back) | Simple, no new server endpoints, correctly composes with existing edits |
| Toolbar placement | Single rotate-right in toolbar + both directions in More menu | Balance of discoverability and toolbar cleanliness |
| Batch implementation | Client-side loop | Matches existing batch action patterns, ships faster, optimize later if needed |
| Edit composition | Append to existing edits | Preserves user's crop/mirror edits when rotating |
14 changes: 11 additions & 3 deletions docs/plans/2026-03-05-image-editing-improvements-plan.md
@@ -15,6 +15,7 @@
### Task 1: Fix person thumbnail scalar subquery (#26045)

**Files:**

- Modify: `server/src/repositories/person.repository.ts:285-293`
- Test: `server/src/services/media.service.spec.ts` (existing tests + new test)

@@ -77,6 +78,7 @@ git commit -m "fix(server): add limit to person thumbnail preview subquery (#260
### Task 2: Fix download-as-album serving original instead of edited (#26182)

**Files:**

- Modify: `server/src/repositories/asset.repository.ts:1086-1106`
- Test: `server/src/services/download.service.spec.ts`

@@ -98,9 +100,7 @@ it('should use edited path when edited flag is true and editedPath exists', asyn
mocks.asset.getForOriginals.mockResolvedValue([editedAsset]);
mocks.storage.createZipStream.mockReturnValue(archiveMock);

await expect(
sut.downloadArchive(authStub.admin, { assetIds: [asset.id], edited: true }),
).resolves.toEqual({
await expect(sut.downloadArchive(authStub.admin, { assetIds: [asset.id], edited: true })).resolves.toEqual({
stream: archiveMock.stream,
});

@@ -142,6 +142,7 @@ git commit -m "fix(server): serve edited files in download-as-album (#26182)"
### Task 3: Fix album thumbnail not using edited version (#25803)

**Files:**

- Modify: `server/src/repositories/asset-job.repository.ts:163-171`
- Test: `server/src/services/notification.service.spec.ts`

@@ -191,6 +192,7 @@ git commit -m "fix(server): prefer edited album thumbnail file (#25803)"
### Task 4: Add i18n keys for quick-rotate

**Files:**

- Modify: `i18n/en.json`

**Step 1: Add new i18n keys**
@@ -217,6 +219,7 @@ git commit -m "feat(i18n): add quick-rotate translation keys"
### Task 5: Add quick-rotate ActionItem to asset service

**Files:**

- Modify: `web/src/lib/services/asset.service.ts`

**Step 1: Read the full file**
@@ -228,6 +231,7 @@ Read `web/src/lib/services/asset.service.ts` to understand the existing action p
Add `RotateRight` action next to the `Edit` action. Imports needed: `mdiRotateRight` from `@mdi/js`; `getAssetEdits`, `editAsset`, and `AssetEditAction` from `@immich/sdk`; `waitForWebsocketEvent` from `$lib/stores/websocket`; `eventManager` from `$lib/managers/event-manager.svelte`; and `toastManager` from `@immich/ui`.

The RotateRight action should:

- Have the same `$if` condition as `Edit` (owner, image, not live photo, not panorama, not GIF, not SVG)
- Use `mdiRotateRight` icon
- Use title `$t('quick_rotate_right')`
@@ -263,6 +267,7 @@ git commit -m "feat(web): add quick-rotate action items to asset service"
### Task 6: Add rotate button to viewer navbar

**Files:**

- Modify: `web/src/lib/components/asset-viewer/asset-viewer-nav-bar.svelte`

**Step 1: Add rotate-right button to the toolbar**
@@ -302,6 +307,7 @@ git commit -m "feat(web): add rotate button to viewer toolbar and More menu"
### Task 7: Handle asset refresh after quick-rotate

**Files:**

- Check: `web/src/lib/components/asset-viewer/asset-viewer.svelte`

**Step 1: Verify the refresh mechanism**
@@ -325,6 +331,7 @@ Only commit if modifications were required.
### Task 8: Add batch rotate timeline action component

**Files:**

- Create: `web/src/lib/components/timeline/actions/RotateAction.svelte`

**Step 1: Create the component**
@@ -420,6 +427,7 @@ git commit -m "feat(web): add batch rotate action component for timeline"
### Task 9: Wire batch rotate into timeline toolbar

**Files:**

- Modify: The component that renders the timeline multi-select toolbar

**Step 1: Find where timeline actions are rendered**
30 changes: 25 additions & 5 deletions docs/plans/2026-03-05-pet-detection-design.md
@@ -5,6 +5,7 @@
Add pet detection and individual pet recognition to Immich, allowing users to find and organize photos of their pets the same way they do with people.

**Phased approach:**

- **Phase 1** — Pet Detection (YOLOv8): detect animals in photos with bounding boxes and species labels
- **Phase 2** — Pet Recognition (MegaDescriptor): generate embeddings for detected pets, cluster them into named individuals shown alongside people

@@ -15,6 +16,7 @@ Backfill: new uploads processed automatically when enabled. Existing assets requ
## Community Context

636 upvotes on [Discussion #7151](https://github.com/immich-app/immich/discussions/7151) — the 3rd most requested feature. Users want:

- Find all photos of a specific pet by name (like Google Photos / Apple Photos)
- Pets shown alongside people in the UI
- Support beyond cats/dogs (horses, birds, etc.)
@@ -25,6 +27,7 @@ Backfill: new uploads processed automatically when enabled. Existing assets requ
### New Model Task

Add to `machine-learning/immich_ml/schemas.py`:

```python
class ModelTask(StrEnum):
PET_DETECTION = "pet-detection"
@@ -33,13 +36,15 @@ class ModelTask(StrEnum):
### New Model Classes

**PetDetector** (`machine-learning/immich_ml/models/pet_detection/detection.py`):

- YOLOv8n (nano, ~6MB ONNX) or YOLOv8s (small, ~22MB ONNX)
- Input: image
- Output: list of `{ boundingBox, species, confidence }`
- Filters to the 10 COCO animal classes: cat, dog, bird, horse, sheep, cow, elephant, bear, zebra, giraffe
- Model hosted on HuggingFace under `immich-app/` org
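
The post-processing step can be sketched as follows (shown in TypeScript for illustration; the real implementation lives in the Python ML service, and the raw-output shape is an assumption):

```typescript
// COCO 80-class ordering places the ten animal classes at ids 14-23.
const COCO_ANIMALS: Record<number, string> = {
  14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep',
  19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe',
};

// Hypothetical shape of a raw YOLOv8 detection after decoding.
type RawDetection = { classId: number; confidence: number; box: [number, number, number, number] };

// Keep only animal classes above the confidence threshold, and map them to
// the { boundingBox, species, confidence } output described above.
function filterPetDetections(raw: RawDetection[], minConfidence: number) {
  return raw
    .filter((d) => d.classId in COCO_ANIMALS && d.confidence >= minConfidence)
    .map((d) => ({ boundingBox: d.box, species: COCO_ANIMALS[d.classId], confidence: d.confidence }));
}
```
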

**PetRecognizer** (`machine-learning/immich_ml/models/pet_detection/recognition.py`) (Phase 2):

- MegaDescriptor (`BVRA/MegaDescriptor-L-384`) exported to ONNX
- Input: cropped pet region from detector
- Output: embedding vector
@@ -48,6 +53,7 @@ class ModelTask(StrEnum):
### Registration

Add both models to:

- `models/__init__.py:get_model_class()` — model class routing
- `models/constants.py` — model name validation

Expand All @@ -74,6 +80,7 @@ petDetection: {
### Jobs & Queue

Add to `server/src/enum.ts`:

- `JobName.PetDetection` — process a single asset
- `JobName.PetDetectionQueueAll` — queue all assets for backfill
- `QueueName.PetDetection`
@@ -87,6 +94,7 @@ In `server/src/services/job.service.ts`, hook `PetDetection` into the post-thumb
New file: `server/src/services/pet-detection.service.ts`

**`handlePetDetection(assetId)`:**

1. Call ML service with detection model (+ recognition model for allowed species)
2. For each detected animal: store bounding box in `asset_face` table with `sourceType = 'MACHINE_LEARNING'`
3. For allowed species (Phase 2): store embedding in `face_search`, cluster to match/create person entries with `type = 'pet'`
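
A minimal sketch of that flow, with repository and ML-client interfaces as injected stand-ins (none of these names are Immich's actual APIs):

```typescript
// Hypothetical shapes; the real service would use Immich's repositories.
type Detection = { boundingBox: number[]; species: string; confidence: number; embedding?: number[] };

interface Deps {
  detectPets(assetId: string): Promise<Detection[]>;
  storeFace(assetId: string, d: Detection, sourceType: 'MACHINE_LEARNING'): Promise<string>; // returns face id
  storeEmbedding(faceId: string, embedding: number[]): Promise<void>;
  markProcessed(assetId: string): Promise<void>;
}

async function handlePetDetection(deps: Deps, assetId: string, allowedSpecies: Set<string>): Promise<void> {
  const detections = await deps.detectPets(assetId);
  for (const d of detections) {
    const faceId = await deps.storeFace(assetId, d, 'MACHINE_LEARNING'); // reuse asset_face for boxes
    if (d.embedding && allowedSpecies.has(d.species)) {
      await deps.storeEmbedding(faceId, d.embedding); // reuse face_search for pet embeddings
    }
  }
  await deps.markProcessed(assetId); // sets asset_job_status.petsDetectedAt
}
```
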
@@ -98,47 +106,58 @@ Query all assets where `petsDetectedAt IS NULL`, queue a `PetDetection` job for
### ML Repository

Add to `server/src/repositories/machine-learning.repository.ts`:

- `detectPets(imagePath, config)` — constructs FormData request to ML service

## Database Schema Changes

Reuse existing tables — no new tables needed.

### `person` table — add columns:

- `type: varchar` — `'person'` (default) or `'pet'`
- `species: varchar | null` — `'cat'`, `'dog'`, `'bird'`, `'horse'`, etc. Null for persons.

### `asset_job_status` table — add column:

- `petsDetectedAt: timestamp | null`

### `asset_face` table — no changes

Pet detections stored as face records with bounding boxes, linked to person entries with `type = 'pet'`.

### `face_search` table — no changes

Pet embeddings stored exactly like face embeddings.

### Migration

One migration adding three nullable or defaulted columns. Existing data is unaffected; adding a nullable column (or, since PostgreSQL 11, one with a constant default) is instant, with no table rewrite.

## Frontend Changes (SvelteKit)

### People Page

- Add species icon/badge (paw icon) on person cards where `type = 'pet'`
- Pets interleaved with people (no separate section)

### Person Detail Page

- Show species label (e.g. "Cat", "Dog") below pet name
- All existing functionality (photo grid, merge, rename) works unchanged

### Admin Settings

New "Pet Detection" section under Machine Learning:

- Enable/disable toggle
- Model name field
- Min confidence score slider
- Allowed species multi-select (default: cat, dog)
- "Re-process all" button for backfill

### No Changes Needed

Search, timeline, albums, sharing — pets are person entities and flow through everything automatically.

## End-to-End Data Flow
@@ -169,6 +188,7 @@ Set asset_job_status.petsDetectedAt = now()
```

**Backfill:**

```
Admin clicks "Re-process all"
|
@@ -181,11 +201,11 @@ Queue PetDetection job for each

## Model Options Summary

| Model | Purpose | Size | Source |
|-------|---------|------|--------|
| YOLOv8n | Pet detection (bounding boxes) | ~6MB ONNX | Ultralytics, export + host on HF |
| YOLOv8s | Pet detection (higher accuracy) | ~22MB ONNX | Ultralytics, export + host on HF |
| MegaDescriptor-L-384 | Pet re-identification embeddings | ~330MB ONNX | BVRA on HuggingFace |
| Model | Purpose | Size | Source |
| -------------------- | -------------------------------- | ----------- | -------------------------------- |
| YOLOv8n | Pet detection (bounding boxes) | ~6MB ONNX | Ultralytics, export + host on HF |
| YOLOv8s | Pet detection (higher accuracy) | ~22MB ONNX | Ultralytics, export + host on HF |
| MegaDescriptor-L-384 | Pet re-identification embeddings | ~330MB ONNX | BVRA on HuggingFace |

## References
