GoodZige
Integrate COLMAP EXIF Pose Priors, GPS Alignment, Model Normalization, Geo-Alignment Export, and EXIF/Downscale Robustness Fixes

Background & Motivation

  • When training on DJI flight images with EXIF, the pipeline could not georegister outputs; we need to inject geospatial information into the workflow.
  • The existing pipeline did not use EXIF pose priors and could not leverage pose_prior_mapper or model_aligner directly.
  • Copying/downscaling sometimes stripped EXIF, leaving no priors in the database and triggering the error "No pose priors in database...".
  • Inconsistent COLMAP builds (GPU flags, vocab-tree backend, and subcommands) caused “unrecognized option/legacy index” issues.
  • The multi-scale ffmpeg downscaling filtergraph occasionally produced unconnected outputs and empty -map "" when --num-downscales > 0.
  • 3DGS training/exports often suffered from huge ECEF coordinate magnitudes, causing numerical instability (NaN/Inf) and failed exports.

What’s Changed

  • EXIF pose priors and alignment
    • Add pose_prior_mapper support (symmetric std + overwrite priors covariance option).
    • Optional model_aligner to align the SfM model to GPS priors; writes back into sparse/0.
    • New matching_method="spatial" (fallback to vocab_tree for toolchains that don’t support spatial; hloc branch always maps to vocab_tree).
  • Model normalization (meter-scale, centered)
    • New normalize_model step: estimate center and scale from the COLMAP points, write a single-line transform file normalization_transform.txt (format: s qw qx qy qz tx ty tz, with an identity quaternion), then invoke colmap model_transformer --transform_path to apply it in place.
  • EXIF retention and downscale fixes
    • If no base transform/downscale is required, just copy files (no re-encode) to preserve EXIF.
    • If re-encoding/downscaling is needed, add -map_metadata 0 to all outputs.
    • Fix ffmpeg filtergraph and -map construction for --num-downscales > 0 (no empty maps, no unconnected outputs).
  • Vocab-tree
    • get_vocab_tree() now fetches a FAISS-compatible vocab tree (if targeting upstream, switch back to the official URL or add auto-upgrade/rebuild logic).
  • Geo-alignment export
    • After ns-export gaussian-splat, write geo_transforms.json containing:
      • dataparser_inverse.matrix (4×4)
      • normalization_inverse.matrix (4×4, plus original scale/quaternion/translation)
      • composite_train_to_ecef.matrix (4×4)
      • composite_train_to_ecef_dataparser: dataparser-style { transform (3×4), scale } to make downstream usage consistent with dataparser_transforms.json
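The composite train→ECEF matrix can be sketched as the product of the two inverses listed above. This is a minimal illustration, not the PR's code: it assumes the dataparser maps x_train = s_dp · (R · x + t) with R a pure rotation, and that normalization maps x_norm = s_n · x_ecef + t_n with an identity rotation. All function names here are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul4(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def dataparser_inverse(transform_3x4: Matrix, scale: float) -> Matrix:
    """Invert x_train = scale * (R x + t), assuming R is a rotation.

    Inverse: x = (R^T / scale) x_train - R^T t.
    """
    rot_t = [[transform_3x4[j][i] for j in range(3)] for i in range(3)]
    t = [transform_3x4[i][3] for i in range(3)]
    inv_t = [-sum(rot_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    rows = [[rot_t[i][j] / scale for j in range(3)] + [inv_t[i]]
            for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def normalization_inverse(scale: float, t: Sequence[float]) -> Matrix:
    """Invert x_norm = scale * x_ecef + t (identity rotation assumed)."""
    inv = 1.0 / scale
    return [[inv, 0.0, 0.0, -t[0] * inv],
            [0.0, inv, 0.0, -t[1] * inv],
            [0.0, 0.0, inv, -t[2] * inv],
            [0.0, 0.0, 0.0, 1.0]]


def composite_train_to_ecef(transform_3x4: Matrix, dp_scale: float,
                            norm_scale: float,
                            norm_t: Sequence[float]) -> Matrix:
    """Undo the dataparser transform first, then the normalization."""
    return matmul4(normalization_inverse(norm_scale, norm_t),
                   dataparser_inverse(transform_3x4, dp_scale))
```

The order matters: the dataparser transform is applied last during training, so its inverse is applied first when mapping back.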

New/Updated CLI Arguments (data processing)

  • --use-pose-prior: enable pose_prior_mapper.
  • --prior-position-std: EXIF pose prior std in meters.
  • --align-model-to-priors, --alignment-max-error: enable and configure model_aligner.
  • --matching-method spatial: use spatial_matcher (fallback if unsupported).
  • --normalize-model: enable normalization (model_transformer).
  • --normalization-center {bbox|mean}: choose center estimation (default bbox).
  • --normalization-target-diagonal and --normalization-scale: set a target diagonal or an explicit scale; when both are given, the explicit scale takes precedence.
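To illustrate how the normalization flags could interact, here is a small sketch that estimates a center (bbox or mean), picks a scale, and formats the single-line transform. It assumes the transform is applied as x' = s · x + t (so t = -s · center) and that the target diagonal refers to the bounding-box diagonal of the points; both are assumptions, as are the function names, and the exact convention should be checked against the COLMAP build in use.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def estimate_center(points: Sequence[Point], mode: str = "bbox") -> Point:
    """Center of the point cloud: bounding-box midpoint or arithmetic mean."""
    axes = list(zip(*points))
    if mode == "bbox":
        return tuple((min(a) + max(a)) / 2.0 for a in axes)
    if mode == "mean":
        return tuple(sum(a) / len(a) for a in axes)
    raise ValueError(f"unknown center mode: {mode}")


def bbox_diagonal(points: Sequence[Point]) -> float:
    """Length of the axis-aligned bounding-box diagonal."""
    axes = list(zip(*points))
    return math.sqrt(sum((max(a) - min(a)) ** 2 for a in axes))


def normalization_line(points: Sequence[Point], center_mode: str = "bbox",
                       target_diagonal: Optional[float] = None,
                       scale: Optional[float] = None) -> str:
    """Return 's qw qx qy qz tx ty tz' with an identity quaternion."""
    if scale is not None:
        s = scale  # the explicit scale takes precedence
    elif target_diagonal is not None:
        diag = bbox_diagonal(points)
        if diag == 0.0:
            raise ValueError("degenerate point cloud: zero diagonal")
        s = target_diagonal / diag
    else:
        s = 1.0  # center only
    c = estimate_center(points, center_mode)
    t = [-s * ci + 0.0 for ci in c]  # + 0.0 avoids printing '-0'
    values = [s, 1.0, 0.0, 0.0, 0.0, *t]
    return " ".join(f"{v:.17g}" for v in values)
```

For example, normalization_line(points, target_diagonal=4.0) would correspond to the --normalization-center bbox --normalization-target-diagonal 4.0 combination used in the run instructions.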

Compatibility & Impact

  • Default behavior unchanged: if you don’t enable new flags, the original flow is used.
  • Normalization runs in-place after SfM + alignment (colmap/sparse/0); downstream readers remain unchanged.
  • The vocab tree fetch now targets a FAISS index; if upstream prefers, switch to the official URL or add auto-upgrade/rebuild.

Files Changed

  • nerfstudio/process_data/colmap_utils.py
    • Extend run_colmap(...) with spatial matching, pose_prior_mapper, model_aligner, and normalization (write normalization_transform.txt and call model_transformer), and remove hard-coded GPU flags for broader compatibility.
  • nerfstudio/process_data/colmap_converter_to_nerfstudio_dataset.py
    • Expose and plumb new options (including matching_method with spatial, pose priors, alignment, normalization). Map spatial → vocab_tree for the hloc branch.
  • nerfstudio/process_data/process_data_utils.py
    • Preserve EXIF: avoid re-encoding when not needed; add -map_metadata 0 when re-encoding/downscaling.
    • Fix ffmpeg filtergraph/-map generation for --num-downscales > 0 (no empty maps, no unconnected outputs).
  • nerfstudio/utils/scripts.py
    • Harden run_command: no decode crash when stderr=None under --verbose; show clearer error messages.
  • nerfstudio/scripts/exporter.py
    • After gaussian-splat export, write geo_transforms.json with inverse and composite transforms, including a dataparser-style { transform, scale } for the composite mapping.
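As a rough illustration of the downscale fix in process_data_utils.py, the sketch below builds an ffmpeg argument list in which every filtergraph output is labeled, every label is mapped exactly once, and -map_metadata 0 is set per output. The helper name and its signature are hypothetical and the actual implementation in the PR may differ; only the ffmpeg options themselves (-filter_complex, -map, -map_metadata) are standard.

```python
from typing import List, Sequence


def build_downscale_cmd(src: str, out_paths: Sequence[str]) -> List[str]:
    """Build an ffmpeg command; out_paths[i] receives src downscaled by 2**i."""
    n = len(out_paths)
    if n == 0:
        raise ValueError("at least one output path is required")

    if n == 1:
        # Single output: no split needed, so no dangling split pads.
        chains = ["[0:v]scale=iw/1:ih/1[out0]"]
    else:
        split_labels = "".join(f"[t{i}]" for i in range(n))
        chains = [f"[0:v]split={n}{split_labels}"]
        chains += [f"[t{i}]scale=iw/{2 ** i}:ih/{2 ** i}[out{i}]"
                   for i in range(n)]

    cmd = ["ffmpeg", "-y", "-i", src, "-filter_complex", ";".join(chains)]
    for i, path in enumerate(out_paths):
        # One -map per labeled output, plus metadata carried over from input 0.
        cmd += ["-map", f"[out{i}]", "-map_metadata", "0", path]
    return cmd
```

Passing the result to subprocess.run as a list avoids shell quoting issues, which is one way an empty -map "" could otherwise creep in. When no transform or downscale is needed, a plain file copy (for example shutil.copy2) sidesteps re-encoding entirely.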

Outcomes

  • Leverage EXIF pose priors and GPS alignment to improve global registration.
  • Normalize to meter-scale and center to stabilize training and reduce NaN/Inf; exports succeed more reliably.
  • Preserve EXIF through copying/downscaling so pose_prior_mapper can read priors from DB.
  • Provide reusable train→ECEF transforms (4×4 and dataparser-style) for frontends/third-party consumers.

How to Run (from source)

  • Data processing (EXIF prior + alignment + normalization):
    • python -m nerfstudio.scripts.process_data images --data DATA --output-dir OUT --camera-type perspective --matching-method vocab_tree --use-pose-prior --prior-position-std 2 --align-model-to-priors --normalize-model --normalization-center bbox --normalization-target-diagonal 4.0 --colmap-cmd colmap --no-verbose
  • Training (avoid re-centering/rescaling twice):
    • python -m nerfstudio.scripts.train splatfacto --data OUT --pipeline.model.camera-optimizer.mode SO3xR3 --pipeline.model.use_scale_regularization True colmap --center-method none --auto-scale-poses False
  • Export (auto-writes geo_transforms.json):
    • python -m nerfstudio.scripts.exporter gaussian-splat --load-config RUN/config.yml --output-dir EXPORT_DIR

Frontend / 3D Engine (brief)

  • For visualization and geo-validation (e.g., GaussianSplats3D), read geo_transforms.json:
    • Use composite_train_to_ecef_dataparser and apply x_ecef = s · (R · x_train + t), where [R | t] is the 3×4 transform and s is the scale.
    • Convert ECEF to geodetic/ENU as needed by the client.
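A client-side version of these two steps might look like the following sketch. It applies the dataparser-style mapping stated above and then converts ECEF to WGS84 geodetic coordinates with a standard fixed-point iteration. The function names are hypothetical, and the conversion is not robust at the poles, where cos(lat) approaches zero.

```python
import math
from typing import Sequence, Tuple

WGS84_A = 6378137.0                 # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563       # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared


def train_to_ecef(x: Sequence[float],
                  transform_3x4: Sequence[Sequence[float]],
                  scale: float) -> Tuple[float, float, float]:
    """Apply x_ecef = s * (R x_train + t) with [R | t] given as 3x4."""
    return tuple(
        scale * (sum(transform_3x4[i][k] * x[k] for k in range(3))
                 + transform_3x4[i][3])
        for i in range(3)
    )


def ecef_to_geodetic(x: float, y: float, z: float,
                     iterations: int = 10) -> Tuple[float, float, float]:
    """Return (latitude deg, longitude deg, ellipsoidal height m)."""
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - WGS84_E2))  # initial guess (height = 0)
    for _ in range(iterations):
        n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - WGS84_E2 * n / (n + h)))
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    h = p / math.cos(lat) - n
    return math.degrees(lat), math.degrees(lon), h
```

For production use, a maintained geodesy library is usually a safer choice than a hand-rolled conversion.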

Known Issues & Notes

  • Vocab-tree FLANN → FAISS: recent COLMAP builds expect a FAISS-based vocab tree; if your cache holds a legacy FLANN tree, rebuild it with vocab_tree_builder.
  • spatial_matcher availability depends on the COLMAP build; fallback is vocab_tree/exhaustive.
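The fallback described here can be sketched as a small resolver. This is only an illustration: the function name, the backend argument, and the idea of passing in a precomputed set of supported COLMAP subcommands are all assumptions, and how the PR actually detects spatial_matcher support is not shown in this description.

```python
from typing import Iterable


def resolve_matching_method(requested: str,
                            supported_commands: Iterable[str],
                            backend: str = "colmap") -> str:
    """Pick the matching method to run, falling back when spatial is unavailable."""
    if requested != "spatial":
        return requested
    if backend == "hloc":
        # Per the PR description, the hloc branch always maps to vocab_tree.
        return "vocab_tree"
    if "spatial_matcher" in set(supported_commands):
        return "spatial"
    # Toolchain lacks spatial_matcher: fall back to vocab_tree.
    return "vocab_tree"
```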

Validation

  • Logs show: pose_prior_mapper, model_aligner, and Done normalizing model.
  • colmap/sparse/0 contains {cameras.bin, images.bin, points3D.bin}.
  • geo_transforms.json is produced successfully.
  • Training/exports avoid NaN/Inf failures; qualitative improvements after GPS alignment.
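The file-level checks above could be automated with a short script such as this one. It is a sketch under assumptions: the helper name is hypothetical, the top-level JSON keys are inferred from the field list in this description, and the location of geo_transforms.json is passed in explicitly rather than assumed.

```python
import json
from pathlib import Path
from typing import List

EXPECTED_MODEL_FILES = ("cameras.bin", "images.bin", "points3D.bin")
# Key names inferred from the PR description; adjust if the file differs.
EXPECTED_GEO_KEYS = (
    "dataparser_inverse",
    "normalization_inverse",
    "composite_train_to_ecef",
    "composite_train_to_ecef_dataparser",
)


def check_outputs(sparse_dir: str, geo_json: str) -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    for name in EXPECTED_MODEL_FILES:
        if not (Path(sparse_dir) / name).is_file():
            problems.append(f"missing {name} in {sparse_dir}")

    geo_path = Path(geo_json)
    if not geo_path.is_file():
        problems.append(f"missing {geo_json}")
        return problems

    data = json.loads(geo_path.read_text())
    for key in EXPECTED_GEO_KEYS:
        if key not in data:
            problems.append(f"geo_transforms.json lacks key {key}")
    return problems
```

A call such as check_outputs("OUT/colmap/sparse/0", "EXPORT_DIR/geo_transforms.json") would print nothing to fix when both the model files and the transform keys are present; the second path is a guess at where the exporter writes the file.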

@GoodZige GoodZige marked this pull request as ready for review August 29, 2025 03:08
@GoodZige
Author

The results of geospatial alignment and video fusion are as follows.
[image: geospatial alignment and video fusion result]

@GoodZige
Author

GoodZige commented Sep 8, 2025

maptalks result:
[image: maptalks result]
