The First Comprehensive SAM Survey: A Comprehensive Survey on Segment Anything Model for Vision and Beyond. Chunhui Zhang, Li Liu, Yawen Cui, Guanjie Huang, Weilin Lin, Yiqian Yang, Yuehong Hu. [paper] [homepage] [Chinese explanation]
Abstract: Artificial intelligence (AI) is evolving towards artificial general intelligence, which refers to the ability of an AI system to perform a wide range of tasks and exhibit a level of intelligence similar to that of a human being. This is in contrast to narrow or specialized AI, which is designed to perform specific tasks with a high degree of efficiency. It is therefore urgent to design a general class of models, which we term foundation models, trained on broad data so that they can be adapted to various downstream tasks. The recently proposed segment anything model (SAM) has made significant progress in breaking the boundaries of segmentation, greatly promoting the development of foundation models for computer vision. To fully comprehend SAM, we conduct a survey study. As the first comprehensive review of the segment anything task for vision and beyond based on the SAM foundation model, this work focuses on its applications to various tasks and data types by discussing its historical development, recent progress, and profound impact on broad applications. We first introduce the background and terminology for foundation models including SAM, as well as state-of-the-art methods contemporaneous with SAM that are significant for the segment anything task. Then, we analyze and summarize the advantages and limitations of SAM across various image processing applications, including software scenes, real-world scenes, and complex scenes. Importantly, many insights are drawn to guide future research towards more versatile foundation models and improvements to SAM's architecture. We also summarize the numerous other applications of SAM in vision and beyond. Finally, we maintain a continuously updated paper list and an open-source project summary for the foundation model SAM here.
Awesome Segment Anything Models: A curated list of awesome segment anything models in computer vision and beyond. This repository supplements our survey paper. We intend to continuously update it.
We strongly encourage authors of relevant works to make a pull request and add their paper's information [here].
💥SAM 2: Segment Anything in Images and Videos was released.
💥The first survey on SAM for videos, Segment Anything for Videos: A Systematic Survey, is now online.
- 2024.07.31: The first survey on SAM for videos went online.
- 2024.07.29: SAM 2 was released.
- 2023.07.14: "Segment Anything" was accepted by ICCV 2023.
- 2023.05.16: An initial version of the paper list and project summary was released.
- 2023.04.05: The paper "Segment Anything" went online.
If you find our work useful in your research, please consider citing:
@article{zhang2023comprehensive,
  title={A Comprehensive Survey on Segment Anything Model for Vision and Beyond},
  author={Zhang, Chunhui and Liu, Li and Cui, Yawen and Huang, Guanjie and Lin, Weilin and Yang, Yiqian and Hu, Yuehong},
  journal={arXiv preprint arXiv:2305.08196},
  year={2023}
}

@article{zhang2024segment,
  title={Segment Anything for Videos: A Systematic Survey},
  author={Zhang, Chunhui and Cui, Yawen and Lin, Weilin and Huang, Guanjie and Rong, Yan and Liu, Li and Shan, Shiguang},
  journal={arXiv preprint arXiv:2408.08315},
  year={2024}
}
The first comprehensive SAM survey: Chunhui Zhang, Li Liu, Yawen Cui, Guanjie Huang, Weilin Lin, Yiqian Yang, Yuehong Hu.
"A Comprehensive Survey on Segment Anything Model for Vision and Beyond." ArXiv (2024). [paper] [homepage] [中文解读] [2023.05] -
SAM for Videos: Chunhui Zhang, Yawen Cui, Weilin Lin, Guanjie Huang, Yan Rong, Li Liu, Shiguang Shan.
"Segment Anything for Videos: A Systematic Survey." ArXiv (2024). [ArXiv] [ChinaXiv] [ResearchGate] [Project] [中文解读] [2024.07] -
SAM4MIS: Yichi Zhang, Rushi Jiao.
"Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey." CBM (2024). [paper] [project] [2023.05] -
Yichi Zhang, Zhenrong Shen.
"Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey." ArXiv (2024). [paper] [code] [2024.08] -
Tianfei Zhou, Fei Zhang, Boyu Chang, Wenguan Wang, Ye Yuan, Ender Konukoglu, Daniel Cremers.
"Image Segmentation in Foundation Model Era: A Survey." ArXiv (2024). [paper] [2024.08] -
Chaoning Zhang, Fachrina Dewi Puspitasari, Sheng Zheng, Chenghao Li, Yu Qiao, Taegoo Kang, Xinru Shan, Chenshuang Zhang, Caiyan Qin, Francois Rameau, Lik-Hang Lee, Sung-Ho Bae, Choong Seon Hong.
"A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering." ArXiv (2024). [paper] [2023.05] -
Mudassar Ali, Tong Wu, Haoji Hu, Qiong Luo, Dong Xu, Weizeng Zheng, Neng Jin, Chen Yang, Jincao Yao.
"A review of the Segment Anything Model (SAM) for medical image analysis: Accomplishments and perspectives." Computerized Medical Imaging and Graphics (2024). [paper] [2024.12]
SAM: Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick.
"Segment Anything." ICCV (2023) Best Paper Honorable Mention. [paper] [homepage] [code] [Zhihu] [Reddit] [2023.04] -
SAM 2: Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer.
"SAM 2: Segment Anything in Images and Videos." ArXiv (2024). [paper] [demo]] [code] [project]] [dataset] [blog] [2024.07] -
GPT-4V: OpenAI.
"GPT-4V(ision) System Card." ArXiv (2023). [paper] [homepage] [2023.09] -
Gemini: Gemini Team, Google.
"Gemini: A Family of Highly Capable Multimodal Models." ArXiv (2023). [paper] [homepage] [blog] [2023.12] -
SEEM: Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Gao, Yong Jae Lee.
"Segment Everything Everywhere All at Once." NeurIPS (2023). [paper] [code] [2023.04] -
SegGPT: Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang.
"SegGPT: Segmenting Everything In Context." ICCV (2023). [paper] [code] [2023.04] -
Grounding DINO: Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang.
"Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection." ArXiv (2023). [paper] [code] [2023.04] -
ImageBind: Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra.
"ImageBind: One Embedding Space To Bind Them All." CVPR (2023). [paper] [homepage] [code] [2023.05] -
LanguageBind: Bin Zhu, Bin Lin, Munan Ning, Yang Yan, Jiaxi Cui, HongFa Wang, Yatian Pang, Wenhao Jiang, Junwu Zhang, Zongwei Li, Wancai Zhang, Zhifeng Li, Wei Liu, Li Yuan.
"LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment." ArXiv (2023). [paper] [code] -
Meta-Transformer: Yiyuan Zhang, Kaixiong Gong, Kaipeng Zhang, Hongsheng Li, Yu Qiao, Wanli Ouyang, Xiangyu Yue.
"Meta-Transformer: A Unified Framework for Multimodal Learning." ArXiv (2023). [paper] [homepage] [code] [中文解读] [2023.07] -
OpenSeeD: Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang.
"A Simple Framework for Open-Vocabulary Segmentation and Detection." ICCV (2023). [paper] [code] [2023.03] -
RAM: Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang.
"Recognize Anything: A Strong Image Tagging Model." ArXiv (2023). [paper] [homepage] [code] [2023.06] -
PACGen: Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee.
"Generate Anything Anywhere in Any Scene." ArXiv (2023). [paper] [homepage] [code] [2023.06] -
ASM: Weiyun Wang, Min Shi, Qingyun Li, Wenhai Wang, Zhenhang Huang, Linjie Xing, Zhe Chen, Hao Li, Xizhou Zhu, Zhiguo Cao, Yushi Chen, Tong Lu, Jifeng Dai, Yu Qiao.
"The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World." ArXiv (2023). [paper] [homepage] [demo] [2023.08] -
OneFormer: Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi.
"OneFormer: One Transformer to Rule Universal Image Segmentation." CVPR (2023). [paper] [homepage] [code] [2022.11] -
OVSeg: Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu.
"Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP." CVPR (2023). [paper] [homepage] [code] [2022.10] -
WAM: Tom Sander, Pierre Fernandez, Alain Durmus, Teddy Furon, Matthijs Douze.
"Watermark Anything with Localized Messages." ArXiv (2024). [paper] [code] [2024.11] -
Sa2VA: Haobo Yuan, Xiangtai Li, Tao Zhang, Zilong Huang, Shilin Xu, Shunping Ji, Yunhai Tong, Lu Qi, Jiashi Feng, Ming-Hsuan Yang.
"Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos." ArXiv (2025). [paper] [code] [project] [hugging face] [2025.01]
💥AHCPTQ: Wenlun Zhang, Shimpei Ando, Kentaro Yoshioka.
"AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model." ArXiv (2025). [paper] [2025.03] -
💥SPD-VFM: Pengchen Liang, Leijun Shi, Huiping Yao, Bin Pu, Jianguo Chen, Lei Zhao, Haishan Huang, Zhuangzhuang Chen, Zhaozhao Xu, Lite Xu, Qing Chang, Yiwei Li.
"Semantic Prior Distillation with Vision Foundation Model for Enhanced Rapid Bone Scintigraphy Image Restoration." ArXiv (2025). [paper] [2025.03] -
💥SHIFNet: Jiayi Zhao, Fei Teng, Kai Luo, Guoqiang Zhao, Zhiyong Li, Xu Zheng, Kailun Yang.
"Unveiling the Potential of Segment Anything Model 2 for RGB-Thermal Semantic Segmentation with Language Guidance." ArXiv (2025). [paper] [code] [2025.03] -
💥ReID-SAM: Kunjun Li, Cheng-Yen Yang, Hsiang-Wei Huang, Jenq-Neng Hwang.
"Technical Report for ReID-SAM on SkiTB Visual Tracking Challenge 2025." ArXiv (2025). [paper] [2025.03] -
💥Clayton Bromley, Alexander Moore, Amar Saini, Doug Poland, Carmen Carrano.
"An Analysis of Segment Anything 2." ArXiv (2025). [paper] [2025.03] -
💥SAGE: Guanyao Wu, Haoyu Liu, Hongming Fu, Yichuan Peng, Jinyuan Liu, Xin Fan, Risheng Liu.
"Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond." ArXiv (2025). [paper] [2025.03] -
💥SparseMamba-PCL: Luyi Qiu, Tristan Till, Xiaobao Guo, Adams Wai-Kin Kong.
"SparseMamba-PCL: Scribble-Supervised Medical Image Segmentation via SAM-Guided Progressive Collaborative Learning." ArXiv (2025). [paper] [code] [2025.03] -
💥SemiSAM+: Yichi Zhang, Bohao Lv, Le Xue, Wenbo Zhang, Yuchen Liu, Yu Fu, Yuan Cheng, Yuan Qi.
"SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models." MIA (2025). [paper] [2025.02] -
💥Silius M. Vandeskog, Magne Aldrin, Daniel Howell, Edvin Fuglebakk.
"Adding smoothing splines to the SAM model improves stock assessment." ArXiv (2025). [paper] [2025.02] -
💥Utku Ozbulak, Seyed Amir Mousavi, Francesca Tozzi, Nikdokht Rashidian, Wouter Willaert, Wesley De Neve, Joris Vankerschaver.
"Less is More? Revisiting the Importance of Frame Rate in Real-Time Zero-Shot Surgical Video Segmentation." ArXiv (2025). [paper] [2025.02] -
BudSAM: Chenxi Zhou, Tianjiao Wan, Kele Xu, Peng Qiao, Yong Dou.
"Segment Anything for Visual Bird Sound Denoising." IEEE SPL (2025). [paper] [code] [2025.02] -
LORENZA: Yehonathan Refael, Iftach Arbel, Ofir Lindenbaum, Tom Tirer.
"LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training and Fine-Tuning via Efficient Zeroth-Order Adaptive SAM Optimization." ArXiv (2025). [paper] [2025.02] -
HumanCLIP: Keito Suzuki, Bang Du, Girish Krishnan, Kunyao Chen, Runfa Blark Li, Truong Nguyen.
"Open-Vocabulary Semantic Part Segmentation of 3D Human." 3DV (2025). [paper] [2025.02] -
CLIP+Grad-CAM+SAM: Muhammad A. Muttaqien, Tomohiro Motoda, Ryo Hanai, Domae Yukiyasu.
"Attention-Guided Integration of CLIP and SAM for Precise Object Masking in Robotic Manipulation." 2025 IEEE/SICE International Symposium on System Integration (2025). [paper] [2025.02] -
VesselSAM: Adnan Iltaf, Rayan Merghani Ahmed, Bin Li, Shoujun Zhou.
"VesselSAM: Leveraging SAM for Aortic Vessel Segmentation with LoRA and Atrous Attention." ArXiv (2025). [paper] [code] [2025.02] -
DICEPTION: Canyu Zhao, Mingyu Liu, Huanyi Zheng, Muzhi Zhu, Zhiyue Zhao, Hao Chen, Tong He, Chunhua Shen.
"DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks." ArXiv (2025). [paper] [code] [project] [2025.02] -
AV2T-SAM: Kyungbok Lee, You Zhang, Zhiyao Duan.
"AUDIO VISUAL SEGMENTATION THROUGH TEXT EMBEDDINGS." ArXiv (2025). [paper] [2025.02] -
LVM-MSC: Feibo Jiang, Siwei Tu, Li Dong, Kezhi Wang, Kun Yang, Ruiqi Liu, Cunhua Pan, Jiangzhou Wang.
"Lightweight Vision Model-based Multi-user Semantic Communication Systems." ArXiv (2025). [paper] [2025.02] -
USegMix: Jiamu Wang, Jin Tae Kwak.
"USegMix: Unsupervised Segment Mix for Efficient Data Augmentation in Pathology Images." ArXiv (2025). [paper] [2025.02] -
SESSRS: Yang Qiao, Bo Zhong, Bailin Du, He Cai, Jinxiong Jiang, Qinhuo Liu, Aixia Yang, Junjun Wu, Xiaoya Wang.
"SAM Enhanced Semantic Segmentation for Remote Sensing Imagery Without Additional Training." TGRS (2025). [paper] [code] [2025.02] -
UrbanSAM: Chenyu Li, Danfeng Hong, Bing Zhang, Yuxuan Li, Gustau Camps-Valls, Xiao Xiang Zhu, Jocelyn Chanussot.
"UrbanSAM: Learning Invariance-Inspired Adapters for Segment Anything Models in Urban Construction." ArXiv (2025). [paper] [code] [2025.02] -
Ufaq Khan, Umair Nawaz, Adnan Qayyum, Shazad Ashraf, Muhammad Bilal, Junaid Qadir.
"Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review." ArXiv (2025). [paper] [2025.02] -
YOLO-SAM: Tianyou Jiang, Mingshun Shao, Tianyi Zhang, Xiaoyu Liu, Qun Yu.
"Soybean pod and seed counting in both outdoor fields and indoor laboratories using unions of deep neural networks." ArXiv (2025). [paper] [2025.02] -
SIYO: Mridul Mayankeyshwar, Lookinder Kumar, Mamata P. Wagh, Swatishree Behuria, Dev Yadav.
"Brain Tumor Detection and Segmentation using SAM integrated YOLOv9 Scheme." ASPCC (2024). [paper] [2025.02] -
FieldSeg: Lucas B. Ferreira, Vitor S. Martins, Uilson R.V. Aires, Nuwan Wijewardane, Xin Zhang, Sathish Samiappan.
"FieldSeg: A scalable agricultural field extraction framework based on the Segment Anything Model and 10-m Sentinel-2 imagery." Computers and Electronics in Agriculture (2025). [paper] [2025.02] -
GDPGO-SAM: Shuzhen Hua, Biao Yang, Xinchang Zhang, Ji Qi, Fengxi Su, Jing Sun, Yongjian Ruan.
"GDPGO-SAM: An Unsupervised Fine Segmentation of Desert Vegetation Driven by Grounding DINO Prompt Generation and Optimization Segment Anything Model." Remote Sensing (2025). [paper] [2025.02] -
Raphael Stock, et al.
"Segment Anything in Medical Images with nnUNet." ArXiv (2025). [paper] [2025.02] -
MedfcientSAM: Bao-Hiep Le, et al.
"MedfcientSAM: A Robust Medical Segmentation Model with Optimized Inference Pipeline for Limited Clinical Settings." ArXiv (2025). [paper] [code] [2025.02] -
SegAnyPET: Yichi Zhang, Le Xue, Wenbo Zhang, Lanlan Li, Yuchen Liu, Chen Jiang, Yuan Cheng, Yuan Qi.
"SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images." ArXiv (2025). [paper] [code] [2025.02] -
Pengchen Liang, Bin Pu, Haishan Huang, Yiwei Li, Hualiang Wang, Weibo Ma, Qing Chang.
"Vision Foundation Models in Medical Image Analysis: Advances and Challenges." ArXiv (2025). [paper] [2025.02] -
SASVi: Ssharvien Kumar Sivakumar, Yannik Frisch, Amin Ranem, Anirban Mukhopadhyay.
"SASVi - Segment Any Surgical Video." ArXiv (2025). [paper] [2025.02] -
SpeHeatal: Yi Shi, Yunkai Wang, Xupeng Tian, Tieyi Zhang, Bing Yao, Hui Wang, Yong Shao, Cencen Wang, Rong Zeng.
"SpeHeatal: A Cluster-Enhanced Segmentation Method for Sperm Morphology Analysis." AAAI (2025). [paper] [2025.02] -
MaizeEar-SAM: Hossein Zaremehrjerdi, Lisa Coffey, Talukder Jubery, Huyu Liu, Jon Turkus, Kyle Linders, James C. Schnable, Patrick S. Schnable, Baskar Ganapathysubramanian.
"MaizeEar-SAM: Zero-Shot Maize Ear Phenotyping." ArXiv (2025). [paper] [2025.02] -
PRISM: Kangning Cui, Rongkun Zhu, Manqi Wang, Wei Tang, Gregory D. Larsen, Victor P. Pauca, Sarra Alqahtani, Fan Yang, David Segurado, David Lutz, Jean-Michel Morel, Miles R. Silman.
"Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms." ArXiv (2025). [paper] [2025.02] -
SAM-Assisted-Registration: Hao Xu, Tengfei Xue, Jianan Fan, Dongnan Liu, Yuqian Chen, Fan Zhang, Carl-Fredrik Westin, Ron Kikinis, Lauren J. O'Donnell, Weidong Cai.
"Medical Image Registration Meets Vision Foundation Model: Prototype Learning and Contour Awareness." IPMI (2025). [paper] [code] [2025.02] -
WRT-SAM: Yunyi Zhou, Kun Shi, Gang Hao.
"WRT-SAM: Foundation Model-Driven Segmentation for Generalized Weld Radiographic Testing ." ArXiv (2025). [paper] [2025.02] -
MITO: Laura Dodds, Tara Boroushaki, Fadel Adib.
"MITO: Enabling Non-Line-of-Sight Perception using Millimeter-waves through Real-World Datasets and Simulation Tools." ArXiv (2025). [paper] [2025.02] -
SAM2Refiner: Yuan Yao, Qiushi Yang, Miaomiao Cui, Liefeng Bo.
"Towards Fine-grained Interactive Segmentation in Images and Videos." ArXiv (2025). [paper] [2025.02] -
SAM-QA: Emil Mededovic, Valdy Laurentius, Yuli Wu, Marcin Kopaczka, Zhu Chen, Mareike Schulz, René Tolba, Johannes Stegmaier.
"No Free Lunch in Annotation either: An objective evaluation of foundation models for streamlining annotation in animal tracking." ArXiv (2025). [paper] [code] [2025.02] -
CBCT-US: Feng Li, Yuan Bi, Dianye Huang, Zhongliang Jiang, Nassir Navab.
"Robotic CBCT Meets Robotic Ultrasound." ArXiv (2025). [paper] [2025.02] -
IDCC-SAM: Samuel Fanijo, Ali Jannesari, Julie Dickerson.
"IDCC-SAM: A Zero-Shot Approach for Cell Counting in Immunocytochemistry Dataset Using the Segment Anything Model." Bioengineering (2025). [paper] [2025.02] -
LV-SAM: Yagang Wu, Tianli Zhao, Shijun Hu, Qin Wu, Yingxu Chen, Xin Huang, Zhoushun Zheng.
"Integrating multi-scale information and diverse prompts in large model SAM-Med2D for accurate left ventricular ejection fraction estimation." Med Biol Eng Comput (2025). [paper] [2025.02] -
LangRS: Mohanad Diab, Polychronis Kolokoussis, Maria Antonia Brovelli.
"Optimizing zero-shot text-based segmentation of remote sensing imagery using SAM and Grounding DINO." Artificial Intelligence in Geosciences (2025). [paper] [code] [2025.02] -
Sijie Xia, Rufu Qin, Yang Lu, Lianjiang Ma, Zhenghu Liu.
"A Monocular Vision-Based Safety Monitoring Framework for Offshore Infrastructures Utilizing Grounded SAM." Journal of Marine Science and Engineering (2025). [paper] [2025.02] -
Yufang He, Bo Chen, Mahdi Motagh, Yuyan Zhu, Songdong Shao, Jiaye Li, Bing Zhang, Hermann Kaufmann.
International Journal of Applied Earth Observation and Geoinformation (2025). [paper] [2025.02] -
Save: Chae Jung Park, Khanh-Binh Nguyen.
"Save: Segment Audio-Visual Easy Way Using The Segment Anything Model." SSRN (2025). [paper] [2025.02] -
CAB-USRI: Jinxin Shao, Haosu Zhang, Jianming Miao.
"Depthanything and SAM for UIE: exploring large model information contributes to underwater image restoration." Machine Vision and Applications (2025). [paper] [2025.02] -
Hui Zhang.
"A SAM-based dual-branch network for remote sensing semantic segmentation." Remote Sensing Letters (2025). [paper] [2025.02] -
SAMCell: Alexandra D. VandeLoo, Nathan J. Malta, Emilio Aponte, Caitlin van Zyl, Danfei Xu, Craig R. Forest.
"SAMCell: Generalized Label-Free Biological Cell Segmentation with Segment Anything." ArXiv (2025). [paper] [2025.02] -
AutoMedSAM: Peng Huang, Shu Hu, Bo Peng, Jiashu Zhang, Hongtu Zhu, Xi Wu, Xin Wang.
"Diffusion-empowered AutoPrompt MedSAM." ArXiv (2025). [paper] [code] [2025.02] -
SAMRefiner: Yuqi Lin, Hengjia Li, Wenqi Shao, Zheng Yang, Jun Zhao, Xiaofei He, Ping Luo, Kaipeng Zhang.
"SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement." ICLR (2025). [paper] [code] [2025.02] -
MTRMB: You Zhou, Jiangshan Zhao, Deyu Zeng, Zuo Zuo, Weixiang Liu, Zongze Wu.
"Multimodal Task Representation Memory Bank vs. Catastrophic Forgetting in Anomaly Detection." ArXiv (2025). [paper] [2025.02] -
FunduSAM: Jinchen Yu, Yongwei Nie, Fei Qi, Wenxiong Liao, Hongmin Cai.
"FunduSAM: A Specialized Deep Learning Model for Enhanced Optic Disc and Cup Segmentation in Fundus Images." ArXiv (2025). [paper] [2025.02] -
GlandSAM: Qixiang Zhang, Yi Li, Cheng Xue, Haonan Wang, Xiaomeng Li.
"GlandSAM: Injecting Morphology Knowledge Into Segment Anything Model for Label-Free Gland Segmentation." TMI (2025). [paper] [2025.02] -
LAM: Wei-Bin Kou, Guangxu Zhu, Rongguang Ye, Shuai Wang, Ming Tang, Yik-Chung Wu.
"Label Anything: An Interpretable, High-Fidelity and Prompt-Free Annotator." ICRA (2025). [paper] [2025.02] -
PP: Wang Xinyi, Kang Hongyu, Wei Peishan, Shuai Li, Yu Sun, Sai Kit Lam, Yongping Zheng.
"Proxy Prompt: Endowing SAM and SAM 2 with Auto-Interactive-Prompt for Medical Segmentation." ArXiv (2025). [paper] [2025.02] -
FE-UNet: Guohao Huo, Ruiting Dai, Ling Shao, Hao Tang.
"FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation." ArXiv (2025). [paper] [2025.02] -
ZISVFM: Ying Zhang, Maoliang Yin, Wenfu Bi, Haibao Yan, Shaohan Bian, Cui-Hua Zhang, Changchun Hua.
"ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models." IEEE Transactions on Robotics (2025). [paper] [code] [2025.02] -
RFMedSAM 2: Bin Xie, Hao Tang, Yan Yan, Gady Agam.
"RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2." ArXiv (2025). [paper] [2025.02] -
FLIP: Manuel Traub, Martin V. Butz.
"Rethinking Vision Transformer for Object Centric Foundation Models." ArXiv (2025). [paper] [2025.02] -
Tell2Reg: Wen Yan, Qianye Yang, Shiqi Huang, Yipei Wang, Shonit Punwani, Mark Emberton, Vasilis Stavrinides, Yipeng Hu, Dean Barratt.
"Tell2Reg: Establishing spatial correspondence between images by the same language prompts." ArXiv (2025). [paper] [code] [2025.02] -
Functional-SAM: Sidak Pal Singh, Hossein Mobahi, Atish Agarwala, Yann Dauphin.
"Avoiding spurious sharpness minimization broadens applicability of SAM." ArXiv (2025). [paper] [2025.02] -
IMDPrompter: Quan Zhang, Yuxin Qi, Xi Tang, Jinwei Fang, Xi Lin, Ke Zhang, Chun Yuan.
"IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning." ICLR (2025). [paper] [2025.02] -
LBG: Rohan Chacko, Nicolai Haeni, Eldar Khaliullin, Lin Sun, Douglas Lee.
"Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation." WACV(2025). [paper] [2025.02] -
GFDS: Tongkun Liu, Bing Li, Xiao Jin, Yupeng Shi, Qiuying Li, Xiang Wei.
"Exploring Few-Shot Defect Segmentation in General Industrial Scenarios with Metric Learning and Vision Foundation Models." ArXiv (2025). [paper] [code] [2025.02] -
SAM-PLE: Mingyu Yang, Jitong Lu, and Hun-Seok Kim.
"SAM-guided Pseudo Label Enhancement for Multi-modal 3D Semantic Segmentation." ICRA (2025). [paper] [2025.02] -
VLP-SAM: Kosuke Sakurai, Ryotaro Shimizu, Masayuki Goto.
"Vision and Language Reference Prompt into SAM for Few-shot Segmentation." ArXiv (2025). [paper] [code] [2025.02] -
Self-Prompt-SAM: Bin Xie, Hao Tang, Dawen Cai, Yan Yan, Gady Agam.
"Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation." ArXiv (2025). [paper] [2025.02] -
PEFT-SAM: Carolin Teuber, Anwai Archit, Constantin Pape.
"Parameter Efficient Fine-Tuning of Segment Anything Model." ArXiv (2025). [paper] [code] [2025.02] -
PathoSAM: Titus Griebel, Anwai Archit, Constantin Pape.
"Segment Anything for Histopathology." ArXiv (2025). [paper] [code] [2025.02] -
AVSBench-Robust: Jia Li, Wenjie Zhao, Ziru Huang, Yunhui Guo, Yapeng Tian.
"Do Audio-Visual Segmentation Models Truly Segment Sounding Objects?." ArXiv (2025). [paper] [2025.02] -
Diogo Ebert Gatti, Eduardo Lobo Lustosa Cabral.
"Systems for Perception of the Free Space Ahead of a Vehicle and Calculation of the Distance to Its Limits" (in Portuguese). ArXiv (2025). [paper] [2025.01] -
OHIF-SAM2: Jaeyoung Cho, Aditya Rastogi, Jingyu Liu, et al.
"OHIF-SAM2: Accelerating Radiology Workflows with Segment Anything Model 2." ArXiv (2025). [paper] [code] [2025.01] -
Joseph Lundy.
"Foosball Robot Object Detection and Angle Estimation." ArXiv (2025). [paper] [2025.01] -
Ziang Niu, Ting Huang, Chengjia Xu, Xinyue Sun, Mohamed Farag Taha, Yong He, Zhengjun Qiu.
"A Novel Approach to Optimize Key Limitations of Azure Kinect DK for Efficient and Precise Leaf Area Measurement." ArXiv (2025). [paper] [2025.01] -
J Valero Casas-Aljama.
"AI-powered 2D animation editor." ArXiv (2025). [paper] [2025.01] -
Neda Tavakoli, et al.
"Automated quantification of left ventricular scar volume in cardiac MRI using large vision models." Journal of Cardiovascular Magnetic Resonance (2025). [paper] [2025.01] -
Mehri Mehrnia, et al.
"Evaluating foundational 'segment anything' (Med-SAM1, Med-SAM2) deep learning models for left atrial segmentation in 3d LGE CMR." Journal of Cardiovascular Magnetic Resonance (2025). [paper] [2025.01] -
SAM2Act: Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox, Ranjay Krishna, Jiafei Duan.
"SAM2Act: Integrating Visual Foundation Model with A MemoryArchitecture for Robotic Manipulation." ArXiv (2025). [paper] [code] [2025.01] -
FlexiCrackNet: Xinlong Wan, Xiaoyan Jiang, Guangsheng Luo, Ferdous Sohel, Jenq-Neng Hwang.
"FlexiCrackNet: A Flexible Pipeline for Enhanced Crack Segmentation with General Features Transferred from SAM." ArXiv (2025). [paper] [2025.01] -
DeepSketchCamo: Ying Zang, Runlong Cao, Jianqi Zhang, Yidong Han, Ziyue Cao, Wenjun Hu, Didi Zhu, Lanyun Zhu, Zejian Li, Deyi Ji, Tianrun Chen.
"Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches." ArXiv (2025). [paper] [2025.01] -
Tongxu Zhang, Bei Wang.
"Point Cloud Upsampling as Statistical Shape Model for Pelvic." ArXiv (2025). [paper] [2025.01] -
Marker Track: Aimee Guo, Weihua Mao.
"Marker Track: Accurate Fiducial Marker Tracking for Evaluation of Residual Motions During Breath-Hold Radiotherapy." Biomedical Physics & Engineering Express (2024). [paper] [code] [2025.01] -
CLISC: Xiaochuan Ma, Jia Fu, Wenjun Liao, Shichuan Zhang, Guotai Wang.
"CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation." ISBI (2025). [paper] [2025.01] -
KD-SAM: Kunal Dasharath Patil, Gowthamaan Palani, Ganapathy Krishnamurthi.
"Efficient Knowledge Distillation of SAM for Medical Image Segmentation." ArXiv (2025). [paper] [2025.01] -
EG-SAM: Longyi Chen, Xiandong Wang, Fengqin Yao, Mingchen Song, Jiaheng Zhang, Shengke Wang.
"An Edge-Guided SAM for effective complex object segmentation." Expert Systems With Applications (2025). [paper] [2025.01] -
Yijie Zhu, Shan E Ahmed Raza.
"Gland Segmentation Using SAM With Cancer Grade as a Prompt." ISBI (2025). [paper] [2025.01] -
MPG-SAM 2: Fu Rong, Meng Lan, Qian Zhang, Lefei Zhang.
"MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation." ArXiv (2025). [paper] [2025.01] -
Gabrielle Hoyer, Michelle W Tong, Rupsa Bhattacharjee, Valentina Pedoia, Sharmila Majumdar.
"Scalable Evaluation Framework for Foundation Models in Musculoskeletal MRI Bridging Computational Innovation with Clinical Utility." ArXiv (2025). [paper] [2025.01] -
APSAM: Jian Wang, Xiaokang Zhang, Xianping Ma, Weikang Yu, Pedram Ghamisi.
"Auto-Prompting SAM for Weakly Supervised Landslide Extraction." ArXiv (2025). [paper] [code] [2025.01] -
MONA: Boxun Hu, Mingze Xia, Ding Zhao, Guanlin Wu.
"MONA: Moving Object Detection from Videos Shot by Dynamic Camera." ArXiv (2025). [paper] [2025.01] -
DynamicEarth: Kaiyu Li, Xiangyong Cao, Yupeng Deng, Chao Pang, Zepeng Xin, Deyu Meng, Zhi Wang.
"DynamicEarth: How Far are We from Open-Vocabulary Change Detection?." ArXiv (2025). [paper] [code] [2025.01] -
fabSAM: Yufeng Xie, Hanzhi Wu, Hongxiang Tong, Lei Xiao, Wenwen Zhou, Ling Li, Thomas Cherico Wanger.
"fabSAM: A Farmland Boundary Delineation Method Based on the Segment Anything Model." ArXiv (2025). [paper] [2025.01] -
MedicoSAM: Anwai Archit, Luca Freckmann, Constantin Pape.
"MedicoSAM: Towards foundation models for medical image segmentation." ArXiv (2025). [paper] [code] [2025.01] -
UW-COT220 & VL-SAM2: Chunhui Zhang, Li Liu, Guanjie Huang, Hao Wen, Xi Zhou, Yanfeng Wang.
"Towards Underwater Camouflaged Object Tracking: Benchmark and Baselines." ArXiv (2025). [paper] [ResearchGate] [project] [2025.01] -
HOPOMOP: Michael Schwingshackl, Fabio Francisco Oberweger, Markus Murschitz.
"Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks." WACV (2025). [paper] [code] [2025.01] -
CableSAM: Aihua Ling, Junwen Wang, Jiaming Lu, Ruyu Liu.
"CableSAM: an efficient automatic segmentation method for aircraft cabin cables." Optoelectronics Letters (2025). [paper] [2025.01] -
Yunlong Wang, Zhiyong Zhang.
"Segment Any Leaf 3D: A Zero-Shot 3D Leaf Instance Segmentation Method Based on Multi-View Images." Sensors (2025). [paper] [2025.01] -
SegmentAnyTooth: Khoa Dang Nguyen, Hung Trong Hoang, Thi-Phuong Hong Doan, Khai Quang Dao, Ding-Han Wang, Ming-Lun Hsu.
"SegmentAnyTooth: An open-source deep learning framework for tooth enumeration and segmentation in intraoral photos." Journal of Dental Sciences (2025). [paper] [2025.01] -
SAM-Glomeruli: Rui Sun, Tianzhu Zhang.
"SAM-Glomeruli: Enhanced Segment Anything Model for Precise Glomeruli." MICCAI Workshop (2024). [paper] [2025.01] -
FATE-SAM: Xingxin He, Yifan Hu, Zhaoye Zhou, Mohamed Jarraya, Fang Liu.
"Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation." ArXiv (2025). [paper] [2025.01] -
Pengru Deng, Jiapeng Yao, Chun Li, Su Wang, Xinrun Li, Varun Ojha, Xuhui He, Takashi Matsumoto.
"Unified Few-shot Crack Segmentation and its Precise 3D Automatic Measurement in Concrete Structures." ArXiv (2025). [paper] [2025.01] -
VRS-HQ: Sitong Gong, Yunzhi Zhuge, Lu Zhang, Zongxin Yang, Pingping Zhang, Huchuan Lu.
"The Devil is in Temporal Token: High Quality Video Reasoning Segmentation." ArXiv (2025). [paper] [code] [2025.01] -
SuperSAM: Waqwoya Abebe, Sadegh Jafari, Sixing Yu, Akash Dutta, Jan Strube, Nathan R. Tallent, Luanzheng Guo, Pablo Munoz, Ali Jannesari.
"SuperSAM: Crafting a SAM Supernetwork via Structured Pruning and Unstructured Parameter Prioritization." ArXiv (2025). [paper] [2025.01] -
SkipClick: Robin Schön, Julian Lorenz, Daniel Kienzle, Rainer Lienhart.
"SkipClick: Combining Quick Responses and Low-Level Features for Interactive Segmentation in Winter Sports Contexts." ArXiv (2025). [paper] [code] [2025.01] -
SAM-DA: Javier Gamazo Tejero, Moritz Schmid, Pablo Márquez Neila, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman.
"SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation." WACV (2025). [paper] [2025.01] -
PGP-SAM: Zhonghao Yan, Zijin Yin, Tianyu Lin, Xiangzhu Zeng, Kongming Liang, Zhanyu Ma.
"PGP-SAM: Prototype-Guided Prompt Learning for Efficient Few-Shot Medical Image Segmentation." ISBI (2025). [paper] [2025.01] -
SST: Zhenyang Feng, Zihe Wang, Saul Ibaven Bueno, Tomasz Frelek, Advikaa Ramesh, Jingyan Bai, Lemeng Wang, Zanming Huang, Jianyang Gu, Jinsu Yoo, Tai-Yu Pan, Arpita Chowdhury, Michelle Ramirez, Elizabeth G. Campolongo, Matthew J. Thompson, Christopher G. Lawrence, Sydne Record, Neil Rosser, Anuj Karpatne, Daniel Rubenstein, Hilmar Lapp, Charles V. Stewart, Tanya Berger-Wolf, Yu Su, Wei-Lun Chao.
"Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation." ArXiv (2025). [paper] [2025.01] -
OCORD: Shuo Zhang, Runpu Wei, Kongming Liang.
"OCORD: Open-Campus Object Removal Dataset." ArXiv (2025). [paper] [code] [2025.01] -
Guided SAM: S.B. van Rooij, G.J. Burghouts.
"Guided SAM: Label-Efficient Part Segmentation." ArXiv (2025). [paper] [2025.01] -
EdgeTAM: Chong Zhou, Chenchen Zhu, Yunyang Xiong, Saksham Suri, Fanyi Xiao, Lemeng Wu, Raghuraman Krishnamoorthi, Bo Dai, Chen Change Loy, Vikas Chandra, Bilge Soran.
"EdgeTAM: On-Device Track Anything Model." ArXiv (2025). [paper] [code] [2025.01] -
RSRefSeg: Keyan Chen, Jiafan Zhang, Chenyang Liu, Zhengxia Zou, Zhenwei Shi.
"RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models." ArXiv (2025). [paper] [code] [2025.01] -
CCT: Olivier Morelle, Justus Bisten, Maximilian W. M. Wintergerst, Robert P. Finger, Thomas Schultz.
"Weakly Supervised Segmentation of Hyper-Reflective Foci with Compact Convolutional Transformers and SAM2." German Conference on Medical Image Computing(2025). [paper] [2025.01] -
FLAIR: Chinmay K Lalgudi, Mark E Leone, Jaden V Clark, Sergio Madrigal-Mora, Mario Espinoza.
"Zero-shot Shark Tracking and Biometrics from Aerial Imagery." ArXiv (2025). [paper] [2025.01] -
SPA: Jihong Hu, Yinhao Li, Rahul Kumar Jain, Lanfen Lin, Yen-wei Chen.
"SPA: Leveraging the SAM with Spatial Priors Adapter for Enhanced Medical Image Segmentation." JBHI (2025). [paper] [2025.01] -
SAM-Upflow Splitter: Wenhui Liu, Yulong Qiao, Zhengyi Xing, Yue Zhao.
"Zero-shot moving ship segmentation based on segment anything network and optical flow network." ELECTRONICS LETTERS (2025). [paper] [2025.01] -
Amir-M. Naddaf-Sh, Vinay S. Baburao, Hassan Zargarzadeh.
"Leveraging Segment Anything Model (SAM) for Weld Defect Detection in Industrial Ultrasonic B-Scan Images." Sensors (2025). [paper] [2025.01] -
AutoFish: Stefan Hein Bengtson, Daniel Lehotský, Vasiliki Ismiroglou, Niels Madsen, Thomas B. Moeslund, Malte Pedersen.
"AutoFish: Dataset and Benchmark for Fine-grained Analysis of Fish." WACV Workshop (2025). [paper] [code] [2025.01] -
MedFocusCLIP : Aadya Arora, Vinay Namboodiri.
"MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention." ArXiv (2025). [paper] [code] [2025.01] -
Risha Goel, Zain Shabeeb, Isabel Panicker, Vida Jamali.
"Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy." ArXiv (2025). [paper] [2025.01] -
SAM4EM: Javier Montalvo, Álvaro García-Martín, Pablo Carballeira, Juan C. SanMiguel.
"Unsupervised Class Generation to Expand Semantic Segmentation Datasets." ArXiv (2025). [paper] [2025.01] -
EdgeSAM: Wenya Yang, Xiao-Diao Chen, Wen Wu, Hongshuai Qin, Kangming Yan, Xiaoyang Mao, Haichuan Song.
"Boosting Deep Unsupervised Edge Detection via Segment Anything Model." IEEE TII (2024). [paper] [2025.01] -
PowerSAM: Nannan Yan, Yuhao Li, Yingke Mao, et al.
"PowerSAM: Edge-Efficient Segment Anything for Power Systems Through Visual Model Distillation." ArXiv (2025). [paper] [2025.01] -
PG-SAG: Tengfei Wang, Xin Wang, Yongmao Hou, Yiwei Xu, Wendi Zhang, Zongqian Zhan.
"PG-SAG: Parallel Gaussian Splatting for Fine-Grained Large-Scale Urban Buildings Reconstruction via Semantic-Aware Grouping." ArXiv (2025). [paper] [code] [2025.01] -
MA-SAM: D. Fan et al.
"MA-SAM: A Multi-atlas Guided SAM Using Pseudo Mask Prompts without Manual Annotation for Spine Image Segmentation." TMI (2025). [paper] [2025.01] -
ReferSAM: S.-A. Liu, H. Xie, J. Ge, Y. Zhang.
"ReferSAM: Unleashing Segment Anything Model for Referring Image Segmentation." TCSVT (2025). [paper] [code] [2025.01] -
YS3AM: S. Mu, J. Liu, P. Zhang, et al.
"YS3AM: Adaptive 3D Reconstruction and Harvesting Target Detection for Clustered Green Asparagus." ArXiv (2025). [paper] [2025.01] -
FCP: Suho Park, SuBeen Lee, Hyun Seok Seong, Jaejoon Yoo, Jae-Pil Heo.
"Foreground-Covering Prototype Generation and Matching for SAM-Aided Few-Shot Segmentation." AAAI (2025). [paper] [code] [2025.01] -
ScarNet: Neda Tavakoli, Amir Ali Rahsepar, Brandon C. Benefield, Daming Shen, Santiago López-Tapia, Florian Schiffers, Jeffrey J. Goldberger, Christine M. Albert, Edwin Wu, Aggelos K. Katsaggelos, Daniel C. Lee, Daniel Kim.
"ScarNet: A Novel Foundation Model for Automated Myocardial Scar Quantification from LGE in Cardiac MRI." ArXiv (2025). [paper] [2025.01] -
EUGIS: Jiang Shang, Yuanmeng Wu, Xiaoxiang Han, Xi Chen, Qi Zhang.
"Evidential Calibrated Uncertainty-Guided Interactive Segmentation paradigm for Ultrasound Images." ArXiv (2025). [paper] [code] [2025.01]
No. | Project | Title | Project page | Code base | Affiliation | Description |
---|---|---|---|---|---|---|
000 | SAM | Segment Anything | Project page | Code | Meta | A foundation model for general image segmentation (see the usage sketch after this table). |
001 | SAM2 | Segment Anything Model 2 | Project page | Code | Meta | A foundation model for promptable segmentation in images and videos. |
002 | SAM-Track | Segment and Track Anything | Colab | Code | Zhejiang University | A project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. |
003 | Grounded-SAM | Grounded-Segment-Anything | Colab | Code | IDEA-Research | A project combining Grounding DINO and SAM to detect and segment anything with text inputs. |
004 | MMDet-SAM | - | - | Code | OpenMMLab | A new way of instance segmentation by combining SAM with closed-set object detection, open-vocabulary object detection, and grounding object detection. |
005 | MMRotate-SAM | Zero-shot Oriented Object Detection with SAM | - | Code | OpenMMLab | A project joining SAM with weakly supervised horizontal-box detection to achieve rotated-box detection. |
006 | MMOCR-SAM | - | - | Code | OpenMMLab | A text detection/recognition solution combined with SAM to segment every text character, with striking text-removal and text-inpainting demos driven by diffusion models and Gradio. |
007 | MMEditing-SAM | - | - | Code | OpenMMLab | A project joining SAM with image generation to create images and edit any part of them. |
008 | Label-Studio-SAM | OpenMMLab PlayGround: Semi-Automated Annotation with Label-Studio and SAM | - | Code | OpenMMLab | A project combining Label-Studio and SAM to achieve semi-automated annotation. |
009 | PaddleSeg | Segment Anything with PaddleSeg | - | Code | PaddlePaddle | Pretrained SAM model parameters in PaddlePaddle format. |
010 | SegGPT | Segmenting Everything In Context | Hugging Face | Code | BAAI-Vision | SAM In Context based on Painter. |
011 | SEEM | Segment Everything Everywhere All at Once | Hugging Face | Code | Microsoft | A project that can segment everything everywhere with multi-modal prompts all at once. |
012 | CLIP Surgery | CLIP Surgery for Better Explainability with Enhancement in Open Vocabulary Tasks | Project page | Code | HKUST | A work that builds on CLIP's explainability to drive SAM from text to mask without manual points. |
013 | SAMCOD | Can SAM Segment Anything? When SAM Meets Camouflaged Object Detection | - | Code | - | SAM applied to the camouflaged object detection (COD) task. |
014 | Inpaint Anything | Segment Anything Meets Image Inpainting | Hugging Face | Code | USTC and EIT | SAM combined with inpainting, enabling smooth object removal. |
015 | PerSAM | Personalize Segment Anything Model with One Shot | Hugging Face | Code | - | SAM with specific concepts. |
016 | MedSAM | Segment Anything in Medical Images | - | Code | - | A step-by-step tutorial with a small dataset to help you quickly utilize SAM. |
017 | Segment-Any-Anomaly | GroundedSAM Anomaly Detection | Colab | Code | HUST | Grounding DINO + SAM to segment any anomaly. |
018 | SSA | Semantic Segment Anything | - | Code | Fudan University | A dense category annotation engine. |
019 | Magic Copy | - | - | Code | - | Magic Copy is a Chrome extension that uses SAM to extract a foreground object from an image and copy it to the clipboard. |
020 | Segment Anything with Clip | Segment Anything with Clip | Hugging Face | Code | - | SAM combined with CLIP. |
021 | MetaSeg | Segment Anything Video | Hugging Face | Code | - | A packaged version of SAM. |
022 | SAM in Napari | Segment Anything Model (SAM) in Napari | Project page | Code | Applied Computer Vision Lab and German Cancer Research Center | Extended SAM's click-based foreground separation to full click-based semantic segmentation and instance segmentation. |
023 | SAM Medical Imaging | SAM Medical Imaging | - | Code | - | SAM for Medical Imaging. |
024 | 3D-Box | 3D-Box via Segment Anything | - | Code | - | SAM is extended to 3D perception by combining it with VoxelNeXt. |
025 | Anything-3D | - | - | Code | - | Anything-3D Novel View, Anything-NeRF, Any 3DFace. |
026 | L2SET | Learning to Segment EveryThing | - | Code | UC Berkeley, FAIR | A new partially supervised training paradigm for instance segmentation. |
027 | Edit Anything | Edit Anything by Segment-Anything | - | Code | - | Edit anything in images powered by SAM, ControlNet, StableDiffusion, etc. |
028 | Image Edit Anything | IEA: Image Editing Anything | - | Code | - | Using stable diffusion and SAM for image editing. |
029 | SAM for Stable Diffusion Webui | Segment Anything for Stable Diffusion WebUI | - | Code | - | This extension connects the AUTOMATIC1111 Stable Diffusion WebUI and the Mikubill ControlNet extension with SAM and GroundingDINO to enhance Stable Diffusion/ControlNet inpainting. |
030 | Earth Observation Tools | Segment Anything EO tools | Colab | Code | - | Earth observation tools for SAM. |
031 | Moving Object Detection | Towards Segmenting Anything That Moves | - | Code | - | A project about SAM + Moving Object Detection. |
032 | OCR-SAM | Optical Character Recognition with Segment Anything | Project page | Code | - | Combining MMOCR with SAM and Stable Diffusion. |
033 | SALT | Segment Anything Labelling Tool | - | Code | - | A project that adds a barebones labeling interface on top of SAM and saves masks in COCO format. |
034 | Prompt Segment Anything | Prompt Segment Anything | - | Code | - | An implementation of zero-shot instance segmentation using SAM. |
035 | SAM-RBox | - | - | Code | - | A project that uses SAM to generate rotated bounding boxes with MMRotate, serving as a comparison method for H2RBox-v2. |
036 | VISAM | MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors | - | Code | - | Combining SAM with MOT to usher in the era of "MOTS". |
037 | SegEO | Segment Anything EO tools | - | Code | - | Tools developed to ease the processing of spatial data (GeoTIFF and TMS) with SAM, using a sliding-window algorithm for big files. |
038 | Napari Segment Anything | Napari Segment Anything | Project page | Code | - | SAM native Qt UI. |
039 | Segment-Anything-U-Specify | Segment-Anything-U-Specify | - | Code | - | Using CLIP and SAM to segment any instance you specify with a text prompt naming it. |
040 | SegDrawer | Simple static web-based mask drawer | Colab | Code | - | Simple static web-based mask drawer, supporting semantic segmentation with SAM. |
041 | Track Anything | Segment Anything Meets Videos | Hugging Face | Code | SUSTech | Track-Anything is a flexible and interactive tool for video object tracking and segmentation. |
042 | Count Anything | - | - | Code | - | A method that uses SAM and CLIP to ground and count any object matching a custom text prompt, without requiring any point or box annotation. |
043 | RAM | Relate Anything Model | Hugging Face | Code | MMLab, NTU and VisCom Lab, KCL/TongJi | Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image. |
044 | Segment Any RGBD | Segment Any RGBD | Project page | Code | - | Segment AnyRGBD is a toolbox to segment rendered depth images based on SAM. |
045 | Show Anything | Show Anything | Hugging Face | Code | Showlab, NUS | Applications that combine SAM with generation models. |
046 | Transfer Any Style | Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate | - | Code | LV-lab, NUS | An interactive demo based on Segment-Anything for style transfer, enabling different content regions to adopt different styles. |
047 | Caption Anything | - | Colab | Code | VIP lab, SUSTech | Caption-Anything is a versatile image processing tool that combines the capabilities of SAM, Visual Captioning, and ChatGPT. |
048 | Image2Paragraph | Transform Image Into Unique Paragraph | Project page | Code | - | Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet. |
049 | LIME SAM | Local Interpretable Model-agnostic Explanations Segment Anything | Colab | Code | - | LIME-SAM aims to create an Explainable Artificial Intelligence (XAI) framework for image classification using LIME (Local Interpretable Model-agnostic Explanations) as the base algorithm, with the super-pixel method replaced by SAM. |
050 | Paint Anything | - | - | Code | - | An interactive demo based on SAM for stroke-based painting which enables human-like painting. |
051 | SAMed | Customized Segment Anything Model for Medical Image Segmentation | Colab | Code | USTC | SAMed is built upon the large-scale image segmentation model, SAM, to explore the new research paradigm of customizing large-scale models for medical image segmentation. |
052 | Personalize SAM | Personalize Segment Anything with 1 Shot in 10 Seconds | Hugging Face | Code | MMLab, CUHK | A training-free Personalization approach for SAM, termed as PerSAM. Given only a single image with a reference mask, PerSAM can segment specific visual concepts. |
053 | Open-vocabulary-Segment-Anything | Open-vocabulary-Segment-Anything | - | Code | - | Combining OwlViT with Segment Anything - Open-vocabulary Detection and Segmentation (Text-conditioned, and Image-conditioned). |
054 | Label-Anything-Pipeline | Label-Anything-Pipeline | - | Code | ZJU | Annotate anything in visual tasks, all in one pipeline with GPT-4 and SAM. |
055 | Grounded-Segment-Any-Parts | Grounded Segment Anything: From Objects to Parts | Project page | Code | HKU | Expands Segment Anything Model (SAM) to support text prompt input. The text prompt can be object-level (e.g., dog) or part-level (e.g., dog head). |
056 | AnyLabeling | AnyLabeling | Youtube page | Code | - | Effortless AI-assisted data labeling with AI support from Segment Anything and YOLO. |
057 | SSA | Semantic-Segment-Anything | Project page | Code | - | Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B). |
058 | RefSAM | Label Data with Segment Anything in Roboflow | Project page | Code | - | Referring Image Segmentation Benchmarking with Segment Anything Model (SAM). |
059 | Roboflow Annotate | Launch: Label Data with Segment Anything in Roboflow | Project page | APP | Roboflow | SAM-assisted labeling for training computer vision models. |
060 | ImageBind SAM | - | - | Code | IDEA-Research | This is an experimental demo aims to combine ImageBind and SAM to generate mask with different modalities. |
061 | X-AnyLabeling | X-AnyLabeling | - | Code | CVHub | A new interactive automatic labeling tool based on AnyLabeling. |
062 | Segment Anything + NNCF | - | - | Code | - | OpenVINO™ NNCF for SAM encoder quantization acceleration. |
063 | YOLOv8 + SAM | - | - | - | - | Use SAM in YOLOv8. |
064 | SearchAnything | SearchAnything | Zhihu blog, Twitter | Code | CAS and MSRA | A semantic local search engine powered by various AI models. |
065 | SAM Meets Stable Diffusion | - | - | Code | PaddlePaddle | Segment and generate anything. |
066 | Language Segment-Anything | - | - | Code | - | SAM with text prompts generates masks for specific objects in images. |
067 | Expedit-SAM | - | - | Code | - | Expediting SAM without Fine-tuning. |
068 | Segment-Anything-Fast | Accelerating Generative AI with PyTorch: Segment Anything, Fast | Project page | Code | Team PyTorch | A batched offline inference oriented version of segment-anything. |
069 | YOLOv9+SAM | YOLOv9+SAM | Project page | Code | - | Dynamic Detection and Segmentation with YOLOv9+SAM. |
070 | LiteMedSAM | LiteMedSAM | Project page | Code | - | A lightweight version of MedSAM for fast training and inference. |
071 | ISAT_with_segment_anything | ISAT_with_segment_anything | Project page | Code | - | An interactive semi-automatic annotation tool based on the segment anything model, supporting SAM, SAM2, SAM-HQ, MobileSAM, EdgeSAM, etc. |
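Most projects in the table wrap the same handful of `segment-anything` calls. Below is a minimal, non-authoritative sketch of the two common usage modes of the official package (row 000); the checkpoint filename, image path, and click coordinates are placeholders.

```python
# A minimal sketch of the two common SAM usage modes with the official
# segment-anything package; the checkpoint and image paths are placeholders.
import cv2
import numpy as np
from segment_anything import (
    SamAutomaticMaskGenerator,
    SamPredictor,
    sam_model_registry,
)

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")

# 1) Interactive mode: segment the object under a foreground click.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor = SamPredictor(sam)
predictor.set_image(image)  # compute the image embedding once per image
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),  # (x, y) pixel coordinates
    point_labels=np.array([1]),           # 1 = foreground, 0 = background
    multimask_output=True,                # return 3 candidate masks
)
best_mask = masks[np.argmax(scores)]      # boolean (H, W) array

# 2) Automatic "segment everything" mode, as used by several of the
#    annotation tools above (e.g., SALT, AnyLabeling).
mask_generator = SamAutomaticMaskGenerator(sam)
proposals = mask_generator.generate(image)  # dicts with "segmentation", "bbox", ...
```

The split between the two modes mirrors how most downstream projects use SAM: interactive tools drive `SamPredictor` with user prompts, while dataset-scale engines rely on the automatic generator's grid of point prompts.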
- VainF/Awesome-Anything
- Hedlen/Awesome Segment Anything
- Vision-Intelligence-and-Robots-Group/Awesome-Segment-Anything
- JerryX1110/Awesome-segment-anything-extensions
- dk-liang/Awesome-Segment-Anything
This project is released under the MIT license. Please see the LICENSE file for more information.