
This repository is for the first comprehensive survey on Meta AI's Segment Anything Model (SAM).



A Comprehensive Survey on Segment Anything Model for Vision and Beyond

The First Comprehensive SAM Survey: A Comprehensive Survey on Segment Anything Model for Vision and Beyond. Chunhui Zhang, Li Liu, Yawen Cui, Guanjie Huang, Weilin Lin, Yiqian Yang, Yuehong Hu. [paper] [homepage] [中文解读]

Abstract: Artificial intelligence (AI) is evolving towards artificial general intelligence, which refers to the ability of an AI system to perform a wide range of tasks and exhibit a level of intelligence similar to that of a human being. This is in contrast to narrow or specialized AI, which is designed to perform specific tasks with a high degree of efficiency. Therefore, it is urgent to design a general class of models, which we term foundation models, trained on broad data that can be adapted to various downstream tasks. The recently proposed segment anything model (SAM) has made significant progress in breaking the boundaries of segmentation, greatly promoting the development of foundation models for computer vision. To fully comprehend SAM, we conduct a survey study. As the first to comprehensively review the progress of segmenting anything task for vision and beyond based on the foundation model of SAM, this work focuses on its applications to various tasks and data types by discussing its historical development, recent progress, and profound impact on broad applications. We first introduce the background and terminology for foundation models including SAM, as well as state-of-the-art methods contemporaneous with SAM that are significant for segmenting anything task. Then, we analyze and summarize the advantages and limitations of SAM across various image processing applications, including software scenes, real-world scenes, and complex scenes. Importantly, many insights are drawn to guide future research to develop more versatile foundation models and improve the architecture of SAM. We also summarize massive other amazing applications of SAM in vision and beyond. Finally, we maintain a continuously updated paper list and an open-source project summary for foundation model SAM at here.
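
To make the promptable-segmentation interface described above concrete, here is a minimal sketch using the official segment_anything Python package; the checkpoint filename and the placeholder image are assumptions, so substitute your own (see the code links below for the real setup).

import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a pretrained SAM checkpoint (download links are in the official repo).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# image: an HxWx3 uint8 RGB array; a blank placeholder is used here.
image = np.zeros((480, 640, 3), dtype=np.uint8)
predictor.set_image(image)

# Prompt SAM with one foreground click (label 1 = foreground, 0 = background).
masks, scores, logits = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return three candidate masks at different granularities
)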

Awesome Segment Anything Models: A curated list of awesome segment anything models in computer vision and beyond. This repository supplements our survey paper. We intend to continuously update it.

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

We strongly encourage authors of relevant works to make a pull request and add their paper's information [here].

💥SAM 2: Segment Anything in Images and Videos was released.

💥The first survey on SAM for videos, Segment Anything for Videos: A Systematic Survey, is now online.
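
Since SAM 2 extends promptable segmentation from images to video, a minimal usage sketch may help; it follows the sam2 repository's video-predictor pattern, with the config/checkpoint names and the frame directory as assumptions.

import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Config/checkpoint names vary by release; take the exact ones from the SAM 2 repo.
predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_hiera_large.pt")

with torch.inference_mode():
    # init_state expects a directory of JPEG frames in the repo's example.
    state = predictor.init_state(video_path="./video_frames")
    # Prompt object 1 with a single foreground click on frame 0.
    predictor.add_new_points_or_box(
        state, frame_idx=0, obj_id=1,
        points=np.array([[320, 240]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),
    )
    # Propagate the prompted mask through the remaining frames.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        pass  # mask_logits holds per-object mask logits for this frame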


🔥 Highlights

Last Updated

- 2024.07.31: The first survey on SAM for videos went online.
- 2024.07.29: SAM 2 was released.
- 2023.07.14: "Segment Anything" was accepted by ICCV 2023.
- 2023.05.16: An initial version of recent papers and projects.
- 2023.04.05: The "Segment Anything" paper went online.


Citation

If you find our work useful in your research, please consider citing:

@article{zhang2023comprehensive,
  title={A Comprehensive Survey on Segment Anything Model for Vision and Beyond},
  author={Zhang, Chunhui and Liu, Li and Cui, Yawen and Huang, Guanjie and Lin, Weilin and Yang, Yiqian and Hu, Yuehong},
  journal={arXiv preprint arXiv:2305.08196},
  year={2023}
}

@article{zhang2024segment,
  title={Segment Anything for Videos: A Systematic Survey},
  author={Zhang, Chunhui and Cui, Yawen and Lin, Weilin and Huang, Guanjie and Rong, Yan and Liu, Li and Shan, Shiguang},
  journal={arXiv preprint arXiv:2408.08315},
  year={2024}
}

Survey

  • The first comprehensive SAM survey: Chunhui Zhang, Li Liu, Yawen Cui, Guanjie Huang, Weilin Lin, Yiqian Yang, Yuehong Hu.
    "A Comprehensive Survey on Segment Anything Model for Vision and Beyond." ArXiv (2024). [paper] [homepage] [中文解读] [2023.05]

  • SAM for Videos: Chunhui Zhang, Yawen Cui, Weilin Lin, Guanjie Huang, Yan Rong, Li Liu, Shiguang Shan.
    "Segment Anything for Videos: A Systematic Survey." ArXiv (2024). [ArXiv] [ChinaXiv] [ResearchGate] [Project] [中文解读] [2024.07]

  • SAM4MIS: Yichi Zhang, Rushi Jiao.
    "Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey." CBM (2024). [paper] [project] [2023.05]

  • Yichi Zhang, Zhenrong Shen.
    "Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey." ArXiv (2024). [paper] [code] [2024.08]

  • Tianfei Zhou, Fei Zhang, Boyu Chang, Wenguan Wang, Ye Yuan, Ender Konukoglu, Daniel Cremers.
    "Image Segmentation in Foundation Model Era: A Survey." ArXiv (2024). [paper] [2024.08]

  • Chaoning Zhang, Fachrina Dewi Puspitasari, Sheng Zheng, Chenghao Li, Yu Qiao, Taegoo Kang, Xinru Shan, Chenshuang Zhang, Caiyan Qin, Francois Rameau, Lik-Hang Lee, Sung-Ho Bae, Choong Seon Hong.
    "A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering." ArXiv (2024). [paper] [2023.05]

  • Mudassar Ali, Tong Wu, Haoji Hu, Qiong Luo, Dong Xu, Weizeng Zheng, Neng Jin, Chen Yang, Jincao Yao.
    "A review of the Segment Anything Model (SAM) for medical image analysis: Accomplishments and perspectives." Computerized Medical Imaging and Graphics (2024). [paper] [2024.12]

Paper List

Seminal Papers

  • SAM: Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick.
    "Segment Anything." ICCV (2023) Best Paper Honorable Mention. [paper] [homepage] [code] [Zhihu] [Reddit] [2023.04] (a "segment everything" usage sketch follows this list)

  • SAM 2: Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer.
    "SAM 2: Segment Anything in Images and Videos." ArXiv (2024). [paper] [demo] [code] [project] [dataset] [blog] [2024.07]

  • GPT-4V: OpenAI.
    "GPT-4V(ision) System Card." ArXiv (2023). [paper] [homepage] [2023.09]

  • Gemini: Gemini Team, Google.
    "Gemini: A Family of Highly Capable Multimodal Models." ArXiv (2023). [paper] [homepage] [blog] [2023.12]

  • SEEM: Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Gao, Yong Jae Lee.
    "Segment Everything Everywhere All at Once." NeurIPS (2023). [paper] [code] [2023.04]

  • SegGPT: Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang.
    "SegGPT: Segmenting Everything In Context." ICCV (2023). [paper] [code] [2023.04]

  • Grounding DINO: Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang.
    "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection." ArXiv (2023). [paper] [code] [2023.04]

  • ImageBind: Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra.
    "ImageBind: One Embedding Space To Bind Them All." CVPR (2023). [paper] [homepage] [code] [2023.05]

  • LanguageBind: Bin Zhu, Bin Lin, Munan Ning, Yang Yan, Jiaxi Cui, HongFa Wang, Yatian Pang, Wenhao Jiang, Junwu Zhang, Zongwei Li, Wancai Zhang, Zhifeng Li, Wei Liu, Li Yuan.
    "LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment." ArXiv (2023). [paper] [code]

  • Meta-Transformer: Yiyuan Zhang, Kaixiong Gong, Kaipeng Zhang, Hongsheng Li, Yu Qiao, Wanli Ouyang, Xiangyu Yue.
    "Meta-Transformer: A Unified Framework for Multimodal Learning." ArXiv (2023). [paper] [homepage] [code] [中文解读] [2023.07]

  • OpenSeeD: Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang.
    "A Simple Framework for Open-Vocabulary Segmentation and Detection." ICCV (2023). [paper] [code] [2023.03]

  • RAM: Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang.
    "Recognize Anything: A Strong Image Tagging Model." ArXiv (2023). [paper] [homepage] [code] [2023.06]

  • PACGen: Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee.
    "Generate Anything Anywhere in Any Scene." ArXiv (2023). [paper] [homepage] [code] [2023.06]

  • ASM: Weiyun Wang, Min Shi, Qingyun Li, Wenhai Wang, Zhenhang Huang, Linjie Xing, Zhe Chen, Hao Li, Xizhou Zhu, Zhiguo Cao, Yushi Chen, Tong Lu, Jifeng Dai, Yu Qiao.
    "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World." ArXiv (2023). [paper] [homepage] [demo] [2023.08]

  • OneFormer: Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi.
    "OneFormer: One Transformer to Rule Universal Image Segmentation." CVPR (2023). [paper] [homepage] [code] [2022.11]

  • OVSeg: Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu.
    "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP." CVPR (2023). [paper] [homepage] [code] [2022.10]

  • WAM: Tom Sander, Pierre Fernandez, Alain Durmus, Teddy Furon, Matthijs Douze.
    "Watermark Anything with Localized Messages." ArXiv (2024). [paper] [code] [2024.11]

  • Sa2VA: Haobo Yuan, Xiangtai Li, Tao Zhang, Zilong Huang, Shilin Xu, Shunping Ji, Yunhai Tong, Lu Qi, Jiashi Feng, Ming-Hsuan Yang.
    "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos." ArXiv (2025). [paper] [code] [project] [hugging face] [2025.01]

Follow-up Papers

The latest papers within a week are marked with a 💥.

2025

💥AHCPTQ: Wenlun Zhang, Shimpei Ando, Kentaro Yoshioka.
"AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model." ArXiv (2025). [paper] [2025.03]

💥SPD-VFM: Pengchen Liang, Leijun Shi, Huiping Yao, Bin Pu, Jianguo Chen, Lei Zhao, Haishan Huang, Zhuangzhuang Chen, Zhaozhao Xu, Lite Xu, Qing Chang, Yiwei Li.
"Semantic Prior Distillation with Vision Foundation Model for Enhanced Rapid Bone Scintigraphy Image Restoration." ArXiv (2025). [paper] [2025.03]

💥SHIFNet: Jiayi Zhao, Fei Teng, Kai Luo, Guoqiang Zhao, Zhiyong Li, Xu Zheng, Kailun Yang.
"Unveiling the Potential of Segment Anything Model 2 for RGB-Thermal Semantic Segmentation with Language Guidance." ArXiv (2025). [paper] [code] [2025.03]

💥ReID-SAM: Kunjun Li, Cheng-Yen Yang, Hsiang-Wei Huang, Jenq-Neng Hwang.
"Technical Report for ReID-SAM on SkiTB Visual Tracking Challenge 2025." ArXiv (2025). [paper] [2025.03]

💥Clayton Bromley, Alexander Moore, Amar Saini, Doug Poland, Carmen Carrano.
"An Analysis of Segment Anything 2." ArXiv (2025). [paper] [2025.03]

💥SAGE: Guanyao Wu, Haoyu Liu, Hongming Fu, Yichuan Peng, Jinyuan Liu, Xin Fan, Risheng Liu.
"Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond." ArXiv (2025). [paper] [2025.03]

💥SparseMamba-PCL: Luyi Qiu, Tristan Till, Xiaobao Guo, Adams Wai-Kin Kong.
"SparseMamba-PCL: Scribble-Supervised Medical Image Segmentation via SAM-Guided Progressive Collaborative Learning." ArXiv (2025). [paper] [code] [2025.03]

💥SemiSAM+: Yichi Zhang, Bohao Lv, Le Xue, Wenbo Zhang, Yuchen Liu, Yu Fu, Yuan Cheng, Yuan Qi.
"SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models." MIA (2025). [paper] [2025.02]

💥Silius M. Vandeskog, Magne Aldrin, Daniel Howell, Edvin Fuglebakk.
"Adding smoothing splines to the SAM model improves stock assessment." ArXiv (2025). [paper] [2025.02]

💥Utku Ozbulak, Seyed Amir Mousavi, Francesca Tozzi, Nikdokht Rashidian, Wouter Willaert, Wesley De Neve, Joris Vankerschaver.
"Less is More? Revisiting the Importance of Frame Rate in Real-Time Zero-Shot Surgical Video Segmentation." ArXiv (2025). [paper] [2025.02]

  • BudSAM: Chenxi Zhou, Tianjiao Wan, Kele Xu, Peng Qiao, Yong Dou.
    "Segment Anything for Visual Bird Sound Denoising." IEEE SPL (2025). [paper] [code] [2025.02]

  • LORENZA: Yehonathan Refael, Iftach Arbel, Ofir Lindenbaum, Tom Tirer.
    "LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training and Fine-Tuning via Efficient Zeroth-Order Adaptive SAM Optimization." ArXiv (2025). [paper] [2025.02]

  • HumanCLIP: Keito Suzuki, Bang Du, Girish Krishnan, Kunyao Chen, Runfa Blark Li, Truong Nguyen.
    "Open-Vocabulary Semantic Part Segmentation of 3D Human." 3DV (2025). [paper] [2025.02]

  • CLIP+Grad-CAM+SAM: Muhammad A. Muttaqien, Tomohiro Motoda, Ryo Hanai, Domae Yukiyasu.
    "Attention-Guided Integration of CLIP and SAM for Precise Object Masking in Robotic Manipulation." 2025 IEEE/SICE International Symposium on System Integration (2025). [paper] [2025.02]

  • VesselSAM: Adnan Iltaf, Rayan Merghani Ahmed, Bin Li, Shoujun Zhou.
    "VesselSAM: Leveraging SAM for Aortic Vessel Segmentation with LoRA and Atrous Attention." ArXiv (2025). [paper] [code] [2025.02]

  • DICEPTION: Canyu Zhao, Mingyu Liu, Huanyi Zheng, Muzhi Zhu, Zhiyue Zhao, Hao Chen, Tong He, Chunhua Shen.
    "DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks." ArXiv (2025). [paper] [code] [project] [2025.02]

  • AV2T-SAM: Kyungbok Lee, You Zhang, Zhiyao Duan.
    "AUDIO VISUAL SEGMENTATION THROUGH TEXT EMBEDDINGS." ArXiv (2025). [paper] [2025.02]

  • LVM-MSC: Feibo Jiang, Siwei Tu, Li Dong, Kezhi Wang, Kun Yang, Ruiqi Liu, Cunhua Pan, Jiangzhou Wang.
    "Lightweight Vision Model-based Multi-user Semantic Communication Systems." ArXiv (2025). [paper] [2025.02]

  • USegMix: Jiamu Wang, Jin Tae Kwak.
    "USegMix: Unsupervised Segment Mix for Efficient Data Augmentation in Pathology Images." ArXiv (2025). [paper] [2025.02]

  • SESSRS: Yang Qiao, Bo Zhong, Bailin Du, He Cai, Jinxiong Jiang, Qinhuo Liu, Aixia Yang, Junjun Wu, Xiaoya Wang.
    "SAM Enhanced Semantic Segmentation for Remote Sensing Imagery Without Additional Training." TGRS (2025). [paper] [code] [2025.02]

  • UrbanSAM: Chenyu Li, Danfeng Hong, Bing Zhang, Yuxuan Li, Gustau Camps-Valls, Xiao Xiang Zhu, Jocelyn Chanussot.
    "UrbanSAM: Learning Invariance-Inspired Adapters for Segment Anything Models in Urban Construction." ArXiv (2025). [paper] [code] [2025.02]

  • Ufaq Khan, Umair Nawaz, Adnan Qayyum, Shazad Ashraf, Muhammad Bilal, Junaid Qadir.
    "Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review." ArXiv (2025). [paper] [2025.02]

  • YOLO-SAM: Tianyou Jiang, Mingshun Shao, Tianyi Zhang, Xiaoyu Liu, Qun Yu.
    "Soybean pod and seed counting in both outdoor fields and indoor laboratories using unions of deep neural networks." ArXiv (2025). [paper] [2025.02]

  • SIYO: Mridul Mayankeyshwar, Lookinder Kumar, Mamata P. Wagh, Swatishree Behuria, Dev Yadav.
    "Brain Tumor Detection and Segmentation using SAM integrated YOLOv9 Scheme." ASPCC (2024). [paper] [2025.02]

  • FieldSeg: Lucas B. Ferreira, Vitor S. Martins, Uilson R.V. Aires, Nuwan Wijewardane, Xin Zhang, Sathish Samiappan.
    "FieldSeg: A scalable agricultural field extraction framework based on the Segment Anything Model and 10-m Sentinel-2 imagery." Computers and Electronics in Agriculture (2025). [paper] [2025.02]

  • GDPGO-SAM: Shuzhen Hua, Biao Yang, Xinchang Zhang, Ji Qi, Fengxi Su, Jing Sun, Yongjian Ruan.
    "GDPGO-SAM: An Unsupervised Fine Segmentation of Desert Vegetation Driven by Grounding DINO Prompt Generation and Optimization Segment Anything Model." Remote Sensing (2025). [paper] [2025.02]

  • Raphael Stock, et al.
    "Segment Anything in Medical Images with nnUNet." ArXiv (2025). [paper] [2025.02]

  • MedfcientSAM: Bao-Hiep Le, et al.
    "MedfcientSAM: A Robust Medical Segmentation Model with Optimized Inference Pipeline for Limited Clinical Settings." ArXiv (2025). [paper] [code] [2025.02]

  • SegAnyPET: Yichi Zhang, Le Xue, Wenbo Zhang, Lanlan Li, Yuchen Liu, Chen Jiang, Yuan Cheng, Yuan Qi.
    "SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images." ArXiv (2025). [paper] [code] [2025.02]

  • Pengchen Liang, Bin Pu, Haishan Huang, Yiwei Li, Hualiang Wang, Weibo Ma, Qing Chang.
    "Vision Foundation Models in Medical Image Analysis: Advances and Challenges." ArXiv (2025). [paper] [2025.02]

  • SASVi: Ssharvien Kumar Sivakumar, Yannik Frisch, Amin Ranem, Anirban Mukhopadhyay.
    "SASVi - Segment Any Surgical Video." ArXiv (2025). [paper] [2025.02]

  • SpeHeatal: Yi Shi, Yunkai Wang, Xupeng Tian, Tieyi Zhang, Bing Yao, Hui Wang, Yong Shao, Cencen Wang, Rong Zeng.
    "SpeHeatal: A Cluster-Enhanced Segmentation Method for Sperm Morphology Analysis." AAAI (2025). [paper] [2025.02]

  • MaizeEar-SAM: Hossein Zaremehrjerdi, Lisa Coffey, Talukder Jubery, Huyu Liu, Jon Turkus, Kyle Linders, James C. Schnable, Patrick S. Schnable, Baskar Ganapathysubramanian.
    "MaizeEar-SAM: Zero-Shot Maize Ear Phenotyping." ArXiv (2025). [paper] [2025.02]

  • PRISM: Kangning Cui, Rongkun Zhu, Manqi Wang, Wei Tang, Gregory D. Larsen, Victor P. Pauca, Sarra Alqahtani, Fan Yang, David Segurado, David Lutz, Jean-Michel Morel, Miles R. Silman.
    "Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms." ArXiv (2025). [paper] [2025.02]

  • SAM-Assisted-Registration: Hao Xu, Tengfei Xue, Jianan Fan, Dongnan Liu, Yuqian Chen, Fan Zhang, Carl-Fredrik Westin, Ron Kikinis, Lauren J. O'Donnell, Weidong Cai.
    "Medical Image Registration Meets Vision Foundation Model: Prototype Learning and Contour Awareness." IPMI (2025). [paper] [code] [2025.02]

  • WRT-SAM: Yunyi Zhou, Kun Shi, Gang Hao.
    "WRT-SAM: Foundation Model-Driven Segmentation for Generalized Weld Radiographic Testing ." ArXiv (2025). [paper] [2025.02]

  • MITO: Laura Dodds, Tara Boroushaki, Fadel Adib.
    "MITO: Enabling Non-Line-of-Sight Perception using Millimeter-waves through Real-World Datasets and Simulation Tools." ArXiv (2025). [paper] [2025.02]

  • SAM2Refiner: Yuan Yao, Qiushi Yang, Miaomiao Cui, Liefeng Bo.
    "Towards Fine-grained Interactive Segmentation in Images and Videos." ArXiv (2025). [paper] [2025.02]

  • SAM-QA: Emil Mededovic, Valdy Laurentius, Yuli Wu, Marcin Kopaczka, Zhu Chen, Mareike Schulz, René Tolba, Johannes Stegmaier.
    "No Free Lunch in Annotation either: An objective evaluation of foundation models for streamlining annotation in animal tracking." ArXiv (2025). [paper] [code] [2025.02]

  • CBCT-US: Feng Li, Yuan Bi, Dianye Huang, Zhongliang Jiang, Nassir Navab.
    "Robotic CBCT Meets Robotic Ultrasound." ArXiv (2025). [paper] [2025.02]

  • IDCC-SAM: Fanijo, Samuel, Ali Jannesari, and Julie Dickerson.
    "IDCC-SAM: A Zero-Shot Approach for Cell Counting in Immunocytochemistry Dataset Using the Segment Anything Model." Bioengineering (2025). [paper] [2025.02]

  • LV-SAM: Yagang Wu, Tianli Zhao, Shijun Hu, Qin Wu, Yingxu Chen, Xin Huang, Zhoushun Zheng.
    "Integrating multi-scale information and diverse prompts in large model SAM-Med2D for accurate left ventricular ejection fraction estimation." Med Biol Eng Comput (2025). [paper] [2025.02]

  • LangRS: Mohanad Diab, Polychronis Kolokoussis, Maria Antonia Brovelli.
    "Optimizing zero-shot text-based segmentation of remote sensing imagery using SAM and Grounding DINO." Artificial Intelligence in Geosciences (2025). [paper] [code] [2025.02]

  • Sijie Xia, Rufu Qin, Yang Lu, Lianjiang Ma, Zhenghu Liu.
    "A Monocular Vision-Based Safety Monitoring Framework for Offshore Infrastructures Utilizing Grounded SAM." Journal of Marine Science and Engineering (2025). [paper] [2025.02]

  • Yufang He, Bo Chen, Mahdi Motagh, Yuyan Zhu, Songdong Shao, Jiaye Li, Bing Zhang, Hermann Kaufmann.
    "International Journal of Applied Earth Observation and Geoinformation." International Journal of Applied Earth Observation and Geoinformation (2025). [paper] [2025.02]

  • Save: Chae Jung Park, Khanh-Binh Nguyen.
    "Save: Segment Audio-Visual Easy Way Using The Segment Anything Model." SSRN (2025). [paper] [2025.02]

  • CAB-USRI: Jinxin Shao, Haosu Zhang & Jianming Miao.
    "Depthanything and SAM for UIE: exploring large model information contributes to underwater image restoration." Machine Vision and Applications (2025). [paper] [2025.02]

  • Hui Zhang.
    "A SAM-based dual-branch network for remote sensing semantic segmentation." Remote Sensing Letters (2025). [paper] [2025.02]

  • SAMCell: Alexandra D. VandeLoo, Nathan J. Malta, Emilio Aponte, Caitlin van Zyl, Danfei Xu, Craig R. Forest.
    "SAMCell: Generalized Label-Free Biological Cell Segmentation with Segment Anything." ArXiv (2025). [paper] [2025.02]

  • AutoMedSAM: Peng Huang, Shu Hu, Bo Peng, Jiashu Zhang, Hongtu Zhu, Xi Wu, Xin Wang.
    "Diffusion-empowered AutoPrompt MedSAM." ArXiv (2025). [paper] [code] [2025.02]

  • SAMRefiner: Yuqi Lin, Hengjia Li, Wenqi Shao, Zheng Yang, Jun Zhao, Xiaofei He, Ping Luo, Kaipeng Zhang.
    "SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement." ICLR (2025). [paper] [code] [2025.02]

  • MTRMB: You Zhou, Jiangshan Zhao, Deyu Zeng, Zuo Zuo, Weixiang Liu, Zongze Wu.
    "Multimodal Task Representation Memory Bank vs. Catastrophic Forgetting in Anomaly Detection." ArXiv (2025). [paper] [2025.02]

  • FunduSAM: Jinchen Yu, Yongwei Nie, Fei Qi, Wenxiong Liao, Hongmin Cai.
    "FunduSAM: A Specialized Deep Learning Model for Enhanced Optic Disc and Cup Segmentation in Fundus Images." ArXiv (2025). [paper] [2025.02]

  • GlandSAM: Qixiang Zhang, Yi Li, Cheng Xue, Haonan Wang, Xiaomeng Li.
    "GlandSAM: Injecting Morphology Knowledge Into Segment Anything Model for Label-Free Gland Segmentation." TMI (2025). [paper] [2025.02]

  • LAM: Wei-Bin Kou, Guangxu Zhu, Rongguang Ye, Shuai Wang, Ming Tang, Yik-Chung Wu.
    "Label Anything: An Interpretable, High-Fidelity and Prompt-Free Annotator." ICRA (2025). [paper] [2025.02]

  • PP: Wang Xinyi, Kang Hongyu, Wei Peishan, Shuai Li, Yu Sun, Sai Kit Lam, Yongping Zheng.
    "Proxy Prompt: Endowing SAM and SAM 2 with Auto-Interactive-Prompt for Medical Segmentation." ArXiv (2025). [paper] [2025.02]

  • FE-UNet: Guohao Huo, Ruiting Dai, Ling Shao, Hao Tang.
    "FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation." ArXiv (2025). [paper] [2025.02]

  • ZISVFM: Ying Zhang, Maoliang Yin, Wenfu Bi, Haibao Yan, Shaohan Bian, Cui-Hua Zhang, Changchun Hua.
    "ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models." IEEE Transactions on Robotics (2025). [paper] [code] [2025.02]

  • RFMedSAM 2: Bin Xie, Hao Tang, Yan Yan, Gady Agam.
    "RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2." ArXiv (2025). [paper] [2025.02]

  • FLIP: Manuel Traub, Martin V. Butz.
    "Rethinking Vision Transformer for Object Centric Foundation Models." ArXiv (2025). [paper] [2025.02]

  • Tell2Reg: Wen Yan, Qianye Yang, Shiqi Huang, Yipei Wang, Shonit Punwani, Mark Emberton, Vasilis Stavrinides, Yipeng Hu, Dean Barratt.
    "Tell2Reg: Establishing spatial correspondence between images by the same language prompts." ArXiv (2025). [paper] [code] [2025.02]

  • Functional-SAM: Sidak Pal Singh, Hossein Mobahi, Atish Agarwala, Yann Dauphin.
    "Avoiding spurious sharpness minimization broadens applicability of SAM." ArXiv (2025). [paper] [2025.02]

  • IMDPrompter: Quan Zhang, Yuxin Qi, Xi Tang, Jinwei Fang, Xi Lin, Ke Zhang, Chun Yuan.
    "IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning." ICLR (2025). [paper] [2025.02]

  • LBG: Rohan Chacko, Nicolai Haeni, Eldar Khaliullin, Lin Sun, Douglas Lee.
    "Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation." WACV(2025). [paper] [2025.02]

  • GFDS: Tongkun Liu, Bing Li, Xiao Jin, Yupeng Shi, Qiuying Li, Xiang Wei.
    "Exploring Few-Shot Defect Segmentation in General Industrial Scenarios with Metric Learning and Vision Foundation Models." ArXiv (2025). [paper] [code] [2025.02]

  • SAM-PLE: Mingyu Yang, Jitong Lu, and Hun-Seok Kim.
    "SAM-guided Pseudo Label Enhancement for Multi-modal 3D Semantic Segmentation." ICRA (2025). [paper] [2025.02]

  • VLP-SAM: Kosuke Sakurai, Ryotaro Shimizu, Masayuki Goto.
    "Vision and Language Reference Prompt into SAM for Few-shot Segmentation." ArXiv (2025). [paper] [code] [2025.02]

  • Self-Prompt-SAM: Bin Xie, Hao Tang, Dawen Cai, Yan Yan, Gady Agam.
    "Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation." ArXiv (2025). [paper] [2025.02]

  • PEFT-SAM: Carolin Teuber, Anwai Archit, Constantin Pape.
    "Parameter Efficient Fine-Tuning of Segment Anything Model." ArXiv (2025). [paper] [code] [2025.02]

  • PathoSAM: Titus Griebel, Anwai Archit, Constantin Pape.
    "Segment Anything for Histopathology." ArXiv (2025). [paper] [code] [2025.02]

  • AVSBench-Robust: Jia Li, Wenjie Zhao, Ziru Huang, Yunhui Guo, Yapeng Tian.
    "Do Audio-Visual Segmentation Models Truly Segment Sounding Objects?." ArXiv (2025). [paper] [2025.02]

  • Diogo Ebert Gatti, Eduardo Lobo Lustosa Cabral.
    "Systems for Perceiving the Free Space Ahead of a Vehicle and Computing the Distance to Its Limits" (in Portuguese). ArXiv (2025). [paper] [2025.01]

  • OHIF-SAM2: Jaeyoung Cho, Aditya Rastogi, Jingyu Liu, et al.
    "OHIF-SAM2: Accelerating Radiology Workflows with Segment Anything Model 2." ArXiv (2025). [paper] [code] [2025.01]

  • Joseph Lundy.
    "Foosball Robot Object Detection and Angle Estimation." ArXiv (2025). [paper] [2025.01]

  • Ziang Niu, Ting Huang, Chengjia Xu, Xinyue Sun, Mohamed Farag Taha, Yong He, Zhengjun Qiu.
    "A Novel Approach to Optimize Key Limitations of Azure Kinect DK for Efficient and Precise Leaf Area Measurement." ArXiv (2025). [paper] [2025.01]

  • J Valero Casas-Aljama.
    "AI-powered 2D animation editor." ArXiv (2025). [paper] [2025.01]

  • Tavakoli, Neda et al.
    "Automated quantification of left ventricular scar volume in cardiac MRI using large vision models." Journal of Cardiovascular Magnetic Resonance (2025). [paper] [2025.01]

  • Mehrnia, Mehri et al.
    "Evaluating foundational 'segment anything' (Med-SAM1, Med-SAM2) deep learning models for left atrial segmentation in 3d LGE CMR." Journal of Cardiovascular Magnetic Resonance (2025). [paper] [2025.01]

  • SAM2Act: Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox, Ranjay Krishna, Jiafei Duan.
    "SAM2Act: Integrating Visual Foundation Model with A MemoryArchitecture for Robotic Manipulation." ArXiv (2025). [paper] [code] [2025.01]

  • FlexiCrackNet: Xinlong Wan, Xiaoyan Jiang, Guangsheng Luo, Ferdous Sohel, Jenq-Neng Hwang.
    "FlexiCrackNet: A Flexible Pipeline for Enhanced Crack Segmentation with General Features Transferred from SAM." ArXiv (2025). [paper] [2025.01]

  • DeepSketchCamo: Ying Zang, Runlong Cao, Jianqi Zhang, Yidong Han, Ziyue Cao, Wenjun Hu, Didi Zhu, Lanyun Zhu, Zejian Li, Deyi Ji, Tianrun Chen.
    "Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches." ArXiv (2025). [paper] [2025.01]

  • Tongxu Zhang, Bei Wang.
    "Point Cloud Upsampling as Statistical Shape Model for Pelvic." ArXiv (2025). [paper] [2025.01]

  • Marker Track: Aimee Guo, Weihua Mao.
    "Marker Track: Accurate Fiducial Marker Tracking for Evaluation of Residual Motions During Breath-Hold Radiotherapy." Biomedical Physics & Engineering Express (2024). [paper] [code] [2025.01]

  • CLISC: Xiaochuan Ma, Jia Fu, Wenjun Liao, Shichuan Zhang, Guotai Wang.
    "CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation." ISBI (2025). [paper] [2025.01]

  • KD-SAM: Kunal Dasharath Patil, Gowthamaan Palani, Ganapathy Krishnamurthi.
    "Efficient Knowledge Distillation of SAM for Medical Image Segmentation." ArXiv (2025). [paper] [2025.01]

  • EG-SAM: Longyi Chen, Xiandong Wang, Fengqin Yao, Mingchen Song, Jiaheng Zhang, Shengke Wang.
    "An Edge-Guided SAM for effective complex object segmentation." Expert Systems With Applications (2025). [paper] [2025.01]

  • Yijie Zhu, Shan E Ahmed Raza.
    "Gland Segmentation Using SAM With Cancer Grade as a Prompt." ISBI (2025). [paper] [2025.01]

  • MPG-SAM 2: Fu Rong, Meng Lan, Qian Zhang, Lefei Zhang.
    "MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation." ArXiv (2025). [paper] [2025.01]

  • Gabrielle Hoyer, Michelle W Tong, Rupsa Bhattacharjee, Valentina Pedoia, Sharmila Majumdar.
    "Scalable Evaluation Framework for Foundation Models in Musculoskeletal MRI Bridging Computational Innovation with Clinical Utility." ArXiv (2025). [paper] [2025.01]

  • APSAM: Jian Wang, Xiaokang Zhang, Xianping Ma, Weikang Yu, Pedram Ghamisi.
    "Auto-Prompting SAM for Weakly Supervised Landslide Extraction." ArXiv (2025). [paper] [code] [2025.01]

  • MONA: Boxun Hu, Mingze Xia, Ding Zhao, Guanlin Wu.
    "MONA: Moving Object Detection from Videos Shot by Dynamic Camera." ArXiv (2025). [paper] [2025.01]

  • DynamicEarth: Kaiyu Li, Xiangyong Cao, Yupeng Deng, Chao Pang, Zepeng Xin, Deyu Meng, Zhi Wang.
    "DynamicEarth: How Far are We from Open-Vocabulary Change Detection?." ArXiv (2025). [paper] [code] [2025.01]

  • fabSAM: Yufeng Xie, Hanzhi Wu, Hongxiang Tong, Lei Xiao, Wenwen Zhou, Ling Li, Thomas Cherico Wanger.
    "fabSAM: A Farmland Boundary Delineation Method Based on the Segment Anything Model." ArXiv (2025). [paper] [2025.01]

  • MedicoSAM: Anwai Archit, Luca Freckmann, Constantin Pape.
    "MedicoSAM: Towards foundation models for medical image segmentation." ArXiv (2025). [paper] [code] [2025.01]

  • UW-COT220 & VL-SAM2: Chunhui Zhang, Li Liu, Guanjie Huang, Hao Wen, Xi Zhou, Yanfeng Wang.
    "Towards Underwater Camouflaged Object Tracking: Benchmark and Baselines." ArXiv (2025). [paper] [ResearchGate] [project] [2025.01]

  • HOPOMOP: Michael Schwingshackl, Fabio Francisco Oberweger, Markus Murschitz.
    "Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks." WACV (2025). [paper] [code] [2025.01]

  • CableSAM: Aihua Ling, Junwen Wang, Jiaming Lu & Ruyu Liu.
    "CableSAM: an efficient automatic segmentation method for aircraft cabin cables." Optoelectronics Letters (2025). [paper] [2025.01]

  • Yunlong Wang, Zhiyong Zhang.
    "Segment Any Leaf 3D: A Zero-Shot 3D Leaf Instance Segmentation Method Based on Multi-View Images." Sensors (2025). [paper] [2025.01]

  • SegmentAnyTooth: Khoa Dang Nguyen, Hung Trong Hoang, Thi-Phuong Hong Doan, Khai Quang Dao, Ding-Han Wang, Ming-Lun Hsu.
    "SegmentAnyTooth: An open-source deep learning framework for tooth enumeration and segmentation in intraoral photos." Journal of Dental Sciences (2025). [paper] [2025.01]

  • SAM-Glomeruli: Sun, Rui, and Tianzhu Zhang.
    "SAM-Glomeruli: Enhanced Segment Anything Model for Precise Glomeruli." MICCAI Workshop (2024). [paper] [2025.01]

  • FATE-SAM: Xingxin He, Yifan Hu, Zhaoye Zhou, Mohamed Jarraya, Fang Liu.
    "Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation." ArXiv (2025). [paper] [2025.01]

  • Pengru Deng, Jiapeng Yao, Chun Li, Su Wang, Xinrun Li, Varun Ojha, Xuhui He, Takashi Matsumoto.
    "Unified Few-shot Crack Segmentation and its Precise 3D Automatic Measurement in Concrete Structures." ArXiv (2025). [paper] [2025.01]

  • VRS-HQ: Sitong Gong, Yunzhi Zhuge, Lu Zhang, Zongxin Yang, Pingping Zhang, Huchuan Lu.
    "The Devil is in Temporal Token: High Quality Video Reasoning Segmentation." ArXiv (2025). [paper] [code] [2025.01]

  • SuperSAM: Waqwoya Abebe, Sadegh Jafari, Sixing Yu, Akash Dutta, Jan Strube, Nathan R. Tallent, Luanzheng Guo, Pablo Munoz, Ali Jannesari.
    "SuperSAM: Crafting a SAM Supernetwork via Structured Pruning and Unstructured Parameter Prioritization." ArXiv (2025). [paper] [2025.01]

  • SkipClick: Robin Schön, Julian Lorenz, Daniel Kienzle, Rainer Lienhart.
    "SkipClick: Combining Quick Responses and Low-Level Features for Interactive Segmentation in Winter Sports Contexts." ArXiv (2025). [paper] [code] [2025.01]

  • SAM-DA: Javier Gamazo Tejero, Moritz Schmid, Pablo Márquez Neila, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman.
    "SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation." WACV (2025). [paper] [2025.01]

  • PGP-SAM: Zhonghao Yan, Zijin Yin, Tianyu Lin, Xiangzhu Zeng, Kongming Liang, Zhanyu Ma.
    "PGP-SAM: Prototype-Guided Prompt Learning for Efficient Few-Shot Medical Image Segmentation." ISBI (2025). [paper] [2025.01]

  • SST: Zhenyang Feng, Zihe Wang, Saul Ibaven Bueno, Tomasz Frelek, Advikaa Ramesh, Jingyan Bai, Lemeng Wang, Zanming Huang, Jianyang Gu, Jinsu Yoo, Tai-Yu Pan, Arpita Chowdhury, Michelle Ramirez, Elizabeth G. Campolongo, Matthew J. Thompson, Christopher G. Lawrence, Sydne Record, Neil Rosser, Anuj Karpatne, Daniel Rubenstein, Hilmar Lapp, Charles V. Stewart, Tanya Berger-Wolf, Yu Su, Wei-Lun Chao.
    "Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation." ArXiv (2025). [paper] [2025.01]

  • OCORD: Shuo Zhang, Runpu Wei, Kongming Liang.
    "OCORD: Open-Campus Object Removal Dataset." ArXiv (2025). [paper] [code] [2025.01]

  • Guided SAM: S.B. van Rooij, G.J. Burghouts.
    "Guided SAM: Label-Efficient Part Segmentation." ArXiv (2025). [paper] [2025.01]

  • EdgeTAM: Chong Zhou, Chenchen Zhu, Yunyang Xiong, Saksham Suri, Fanyi Xiao, Lemeng Wu, Raghuraman Krishnamoorthi, Bo Dai, Chen Change Loy, Vikas Chandra, Bilge Soran.
    "EdgeTAM: On-Device Track Anything Model." ArXiv (2025). [paper] [code] [2025.01]

  • RSRefSeg: Keyan Chen, Jiafan Zhang, Chenyang Liu, Zhengxia Zou, Zhenwei Shi.
    "RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models." ArXiv (2025). [paper] [code] [2025.01]

  • CCT: Olivier Morelle, Justus Bisten, Maximilian W. M. Wintergerst, Robert P. Finger, Thomas Schultz.
    "Weakly Supervised Segmentation of Hyper-Reflective Foci with Compact Convolutional Transformers and SAM2." German Conference on Medical Image Computing(2025). [paper] [2025.01]

  • FLAIR: Chinmay K Lalgudi, Mark E Leone, Jaden V Clark, Sergio Madrigal-Mora, Mario Espinoza.
    "Zero-shot Shark Tracking and Biometrics from Aerial Imagery." ArXiv (2025). [paper] [2025.01]

  • SPA: Jihong Hu, Yinhao Li, Rahul Kumar Jain, Lanfen Lin, Yen-wei Chen.
    "SPA: Leveraging the SAM with Spatial Priors Adapter for Enhanced Medical Image Segmentation." JBHI (2025). [paper] [2025.01]

  • SAM-Upflow Splitter: Wenhui Liu, Yulong Qiao, Zhengyi Xing, Yue Zhao.
    "Zero-shot moving ship segmentation based on segment anything network and optical flow network." ELECTRONICS LETTERS (2025). [paper] [2025.01]

  • Amir-M. Naddaf-Sh, Vinay S. Baburao, Hassan Zargarzadeh.
    "Leveraging Segment Anything Model (SAM) for Weld Defect Detection in Industrial Ultrasonic B-Scan Images." Sensors (2025). [paper] [2025.01]

  • Sa2VA: Haobo Yuan, Xiangtai Li, Tao Zhang, Zilong Huang, Shilin Xu, Shunping Ji, Yunhai Tong, Lu Qi, Jiashi Feng, Ming-Hsuan Yang.
    "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos." ArXiv (2025). [paper] [code] [project] [hugging face] [2025.01]

  • AutoFish: Stefan Hein Bengtson, Daniel Lehotský, Vasiliki Ismiroglou, Niels Madsen, Thomas B. Moeslund, Malte Pedersen.
    "AutoFish: Dataset and Benchmark for Fine-grained Analysis of Fish." WACV Workshop (2025). [paper] [code] [2025.01]

  • MedFocusCLIP: Aadya Arora, Vinay Namboodiri.
    "MedFocusCLIP: Improving few shot classification in medical datasets using pixel wise attention." ArXiv (2025). [paper] [code] [2025.01]

  • Risha Goel, Zain Shabeeb, Isabel Panicker, Vida Jamali.
    "Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy." ArXiv (2025). [paper] [2025.01]

  • SAM4EM: Javier Montalvo, Álvaro García-Martín, Pablo Carballeira, Juan C. SanMiguel.
    "Unsupervised Class Generation to Expand Semantic Segmentation Datasets." ArXiv (2025). [paper] [2025.01]

  • EdgeSAM: Wenya Yang, Xiao-Diao Chen, Wen Wu, Hongshuai Qin, Kangming Yan, Xiaoyang Mao, Haichuan Song.
    "Boosting Deep Unsupervised Edge Detection via Segment Anything Model." IEEE TII (2024). [paper] [2025.01]

  • PowerSAM: Nannan Yan, Yuhao Li, Yingke Mao, et al.
    "PowerSAM: Edge-Efficient Segment Anything for Power Systems Through Visual Model Distillation." ArXiv (2025). [paper] [2025.01]

  • PG-SAG: Tengfei Wang, Xin Wang, Yongmao Hou, Yiwei Xu, Wendi Zhang, Zongqian Zhan.
    "PG-SAG: Parallel Gaussian Splatting for Fine-Grained Large-Scale Urban Buildings Reconstruction via Semantic-Aware Grouping." ArXiv (2025). [paper] [code] [2025.01]

  • MA-SAM: D. Fan et al.
    "MA-SAM: A Multi-atlas Guided SAM Using Pseudo Mask Prompts without Manual Annotation for Spine Image Segmentation." TMI (2025). [paper] [2025.01]

  • ReferSAM: S.-A. Liu, H. Xie, J. Ge, Y. Zhang.
    "ReferSAM: Unleashing Segment Anything Model for Referring Image Segmentation." TCSVT (2025). [paper] [code] [2025.01]

  • YS3AM: Mu S, Liu J, Zhang P, et al.
    "YS3AM: Adaptive 3D Reconstruction and Harvesting Target Detection for Clustered Green Asparagus." ArXiv (2025). [paper] [2025.01]

  • FCP: Suho Park, SuBeen Lee, Hyun Seok Seong, Jaejoon Yoo, Jae-Pil Heo.
    "Foreground-Covering Prototype Generation and Matching for SAM-Aided Few-Shot Segmentation." AAAI (2025). [paper] [code] [2025.01]

  • ScarNet: Neda Tavakoli, Amir Ali Rahsepar, Brandon C. Benefield, Daming Shen, Santiago López-Tapia, Florian Schiffers, Jeffrey J. Goldberger, Christine M. Albert, Edwin Wu, Aggelos K. Katsaggelos, Daniel C. Lee, Daniel Kim.
    "ScarNet: A Novel Foundation Model for Automated Myocardial Scar Quantification from LGE in Cardiac MRI." ArXiv (2025). [paper] [2025.01]

  • EUGIS: Jiang Shang, Yuanmeng Wu, Xiaoxiang Han, Xi Chen, Qi Zhang.
    "Evidential Calibrated Uncertainty-Guided Interactive Segmentation paradigm for Ultrasound Images." ArXiv (2025). [paper] [code] [2025.01]

2024

Paper list 2024

2023

Paper list 2023

Open Source Projects

| No. | Project | Title | Project page | Code base | Affiliation | Description |
|-----|---------|-------|--------------|-----------|-------------|-------------|
| 000 | SAM | Segment Anything | Project page | Code | Meta | A foundation model for general image segmentation. |
| 001 | SAM2 | Segment Anything Model 2 | Project page | Code | Meta | A video foundation model. |
| 002 | SAM-Track | Segment and Track Anything | Colab | Code | Zhejiang University | Tracks and segments any objects in videos, either automatically or interactively. |
| 003 | Grounded-SAM | Grounded-Segment-Anything | Colab | Code | IDEA-Research | Combines Grounding DINO and SAM to detect and segment anything with text inputs (see the sketch after this table). |
| 004 | MMDet-SAM | - | - | Code | OpenMMLab | A new take on instance segmentation that combines SAM with closed-set, open-vocabulary, and grounding object detection. |
| 005 | MMRotate-SAM | Zero-shot Oriented Object Detection with SAM | - | Code | OpenMMLab | Joins SAM with weakly supervised horizontal-box detection to achieve rotated-box detection. |
| 006 | MMOCR-SAM | - | - | Code | OpenMMLab | Text detection/recognition plus SAM to segment every text character, with text-removal and text-inpainting demos driven by diffusion models and Gradio. |
| 007 | MMEditing-SAM | - | - | Code | OpenMMLab | Joins SAM with image generation to create images and edit any part of them. |
| 008 | Label-Studio-SAM | OpenMMLab PlayGround: Semi-Automated Annotation with Label-Studio and SAM | - | Code | OpenMMLab | Combines Label-Studio and SAM for semi-automated annotation. |
| 009 | PaddleSeg | Segment Anything with PaddleSeg | - | Code | PaddlePaddle | Pretrained model parameters in PaddlePaddle format. |
| 010 | SegGPT | Segmenting Everything In Context | Hugging Face | Code | BAAI-Vision | SAM in context, based on Painter. |
| 011 | SEEM | Segment Everything Everywhere All at Once | Hugging Face | Code | Microsoft | Segments everything everywhere with multi-modal prompts all at once. |
| 012 | CLIP Surgery | CLIP Surgery for Better Explainability with Enhancement in Open Vocabulary Tasks | Project page | Code | HKUST | Builds on CLIP's explainability to achieve text-to-mask with SAM, without manual points. |
| 013 | SAMCOD | Can SAM Segment Anything? When SAM Meets Camouflaged Object Detection | - | Code | - | SAM + camouflaged object detection (COD). |
| 014 | Inpaint Anything | Segment Anything Meets Image Inpainting | Hugging Face | Code | USTC and EIT | Combines SAM with inpainting to remove objects smoothly. |
| 015 | PerSAM | Personalize Segment Anything Model with One Shot | Hugging Face | Code | - | SAM with specific concepts. |
| 016 | MedSAM | Segment Anything in Medical Images | - | Code | - | A step-by-step tutorial with a small dataset to help you quickly utilize SAM. |
| 017 | Segment-Any-Anomaly | GroundedSAM Anomaly Detection | Colab | Code | HUST | Grounding DINO + SAM to segment any anomaly. |
| 018 | SSA | Semantic Segment Anything | - | Code | Fudan University | A dense category annotation engine. |
| 019 | Magic Copy | - | - | Code | - | A Chrome extension that uses SAM to extract a foreground object from an image and copy it to the clipboard. |
| 020 | Segment Anything with Clip | Segment Anything with Clip | Hugging Face | Code | - | SAM combined with CLIP. |
| 021 | MetaSeg | Segment Anything Video | Hugging Face | Code | - | Packaged version of SAM. |
| 022 | SAM in Napari | Segment Anything Model (SAM) in Napari | Project page | Code | Applied Computer Vision Lab and German Cancer Research Center | Extends SAM's click-based foreground separation to full click-based semantic and instance segmentation. |
| 023 | SAM Medical Imaging | SAM Medical Imaging | - | Code | - | SAM for medical imaging. |
| 024 | 3D-Box | 3D-Box via Segment Anything | - | Code | - | Extends SAM to 3D perception by combining it with VoxelNeXt. |
| 025 | Anything-3D | - | - | Code | - | Anything-3D: Novel View, Anything-NeRF, Any 3DFace. |
| 026 | L2SET | Learning to Segment EveryThing | - | Code | UC Berkeley, FAIR | A new partially supervised training paradigm for instance segmentation. |
| 027 | Edit Anything | Edit Anything by Segment-Anything | - | Code | - | Edit anything in images, powered by SAM, ControlNet, Stable Diffusion, etc. |
| 028 | Image Edit Anything | IEA: Image Editing Anything | - | Code | - | Uses Stable Diffusion and SAM for image editing. |
| 029 | SAM for Stable Diffusion Webui | Segment Anything for Stable Diffusion WebUI | - | Code | - | Connects the AUTOMATIC1111 Stable Diffusion WebUI and the Mikubill ControlNet extension with SAM and GroundingDINO to enhance Stable Diffusion/ControlNet inpainting. |
| 030 | Earth Observation Tools | Segment Anything EO tools | Colab | Code | - | Earth observation tools for SAM. |
| 031 | Moving Object Detection | Towards Segmenting Anything That Moves | - | Code | - | SAM + moving object detection. |
| 032 | OCR-SAM | Optical Character Recognition with Segment Anything | Project page | Code | - | Combines MMOCR with SAM and Stable Diffusion. |
| 033 | SALT | Segment Anything Labelling Tool | - | Code | - | Uses SAM with a barebones interface to label images and saves the masks in COCO format. |
| 034 | Prompt Segment Anything | Prompt Segment Anything | - | Code | - | An implementation of zero-shot instance segmentation using SAM. |
| 035 | SAM-RBox | - | - | Code | - | Uses SAM to generate rotated bounding boxes with MMRotate, as a comparison method for H2RBox-v2. |
| 036 | VISAM | MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors | - | Code | - | Combines SAM with MOT, opening the era of "MOTS". |
| 037 | SegEO | Segment Anything EO tools | - | Code | - | Tools that ease the processing of spatial data (GeoTIFF and TMS) with SAM, using a sliding-window algorithm for large files. |
| 038 | Napari Segment Anything | Napari Segment Anything | Project page | Code | - | SAM native Qt UI. |
| 039 | Segment-Anything-U-Specify | Segment-Anything-U-Specify | - | Code | - | Uses CLIP and SAM to segment any instance you specify with a text prompt of instance names. |
| 040 | SegDrawer | Simple static web-based mask drawer | Colab | Code | - | Simple static web-based mask drawer supporting semantic segmentation with SAM. |
| 041 | Track Anything | Segment Anything Meets Videos | Hugging Face | Code | SUSTech | A flexible and interactive tool for video object tracking and segmentation. |
| 042 | Count Anything | - | - | Code | - | Uses SAM and CLIP to ground and count any object matching a custom text prompt, without requiring point or box annotations. |
| 043 | RAM | Relate Anything Model | Hugging Face | Code | MMLab, NTU and VisCom Lab, KCL/TongJi | Takes an image as input and uses SAM to identify the corresponding masks within the image. |
| 044 | Segment Any RGBD | Segment Any RGBD | Project page | Code | - | A toolbox to segment rendered depth images based on SAM. |
| 045 | Show Anything | Show Anything | Hugging Face | Code | Showlab, NUS | Applications compatible with both SAM and generation models. |
| 046 | Transfer Any Style | Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate | - | Code | LV-lab, NUS | An interactive Segment-Anything-based demo for style transfer that applies different styles to different content regions. |
| 047 | Caption Anything | - | Colab | Code | VIP lab, SUSTech | A versatile image processing tool that combines SAM, visual captioning, and ChatGPT. |
| 048 | Image2Paragraph | Transform Image Into Unique Paragraph | Project page | Code | - | Transforms an image into a unique paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, and ControlNet. |
| 049 | LIME SAM | Local Interpretable Model-agnostic Explanations Segment Anything | Colab | Code | - | An Explainable AI (XAI) framework for image classification that uses LIME with the super-pixel method replaced by SAM. |
| 050 | Paint Anything | - | - | Code | - | An interactive SAM-based demo for stroke-based, human-like painting. |
| 051 | SAMed | Customized Segment Anything Model for Medical Image Segmentation | Colab | Code | USTC | Builds on SAM to explore customizing large-scale models for medical image segmentation. |
| 052 | Personalize SAM | Personalize Segment Anything with 1 Shot in 10 Seconds | Hugging Face | Code | MMLab, CUHK | A training-free personalization approach (PerSAM): given a single image with a reference mask, it segments specific visual concepts. |
| 053 | Open-vocabulary-Segment-Anything | Open-vocabulary-Segment-Anything | - | Code | - | Combines OwlViT with Segment Anything for open-vocabulary detection and segmentation (text- and image-conditioned). |
| 054 | Labal-Anything-Pipeline | Label-Anything-Pipeline | - | Code | ZJU | Annotate anything in visual tasks in one pipeline with GPT-4 and SAM. |
| 055 | Grounded-Segment-Any-Parts | Grounded Segment Anything: From Objects to Parts | Project page | Code | HKU | Expands SAM to support text prompts at the object level (e.g., dog) and part level (e.g., dog head). |
| 056 | AnyLabeling | AnyLabeling | Youtube page | Code | - | Effortless AI-assisted data labeling with Segment Anything and YOLO. |
| 057 | SSA | Semantic-Segment-Anything | Project page | Code | - | An automated dense category annotation engine that provides initial semantic labeling for the Segment Anything dataset (SA-1B). |
| 058 | RefSAM | Label Data with Segment Anything in Roboflow | Project page | Code | - | Referring image segmentation benchmarking with SAM. |
| 059 | Roboflow Annotate | Launch: Label Data with Segment Anything in Roboflow | Project page | APP | Roboflow | SAM-assisted labeling for training computer vision models. |
| 060 | ImageBind SAM | - | - | Code | IDEA-Research | An experimental demo combining ImageBind and SAM to generate masks from different modalities. |
| 061 | X-AnyLabeling | X-AnyLabeling | WeChat | Code | CVHub | A new interactive automatic labeling tool based on AnyLabeling. |
| 062 | Segment Anything + NNCF | - | WeChat | Code | - | OpenVINO™ NNCF quantization acceleration for the SAM encoder. |
| 063 | YOLOv8 + SAM | - | WeChat | - | - | Use SAM in YOLOv8. |
| 064 | SearchAnything | SearchAnything | Zhihu blog, Twitter | Code | CAS and MSRA | A semantic local search engine powered by various AI models. |
| 065 | SAM Meets Stable Diffusion | - | WeChat | Code | PaddlePaddle | Segment and generate anything. |
| 066 | Language Segment-Anything | - | - | Code | - | SAM with text prompts to generate masks for specific objects in images. |
| 067 | Expedit-SAM | - | - | Code | - | Expediting SAM without fine-tuning. |
| 068 | Segment-Anything-Fast | Accelerating Generative AI with PyTorch: Segment Anything, Fast | Project page | Code | Team PyTorch | A batched, offline-inference-oriented version of segment-anything. |
| 069 | YOLOv9+SAM | YOLOv9+SAM | Project page | Code | - | Dynamic detection and segmentation with YOLOv9+SAM. |
| 070 | LiteMedSAM | LiteMedSAM | Project page | Code | - | A lightweight version of MedSAM for fast training and inference. |
| 071 | ISAT_with_segment_anything | ISAT_with_segment_anything | Project page | Code | - | An interactive semi-automatic annotation tool based on SAM, supporting SAM, SAM2, SAM-HQ, MobileSAM, EdgeSAM, etc. |
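
Several projects above (e.g., Grounded-SAM, No. 003) share one recipe: an open-set detector turns a text prompt into boxes, and SAM turns each box into a mask. Below is a minimal sketch of that pattern; detect_boxes is a hypothetical stand-in for Grounding DINO, and the checkpoint path is an assumption.

import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def detect_boxes(image: np.ndarray, text: str) -> np.ndarray:
    # Hypothetical stand-in for an open-set detector such as Grounding DINO:
    # should return an (N, 4) array of xyxy boxes for regions matching `text`.
    raise NotImplementedError

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed path
predictor = SamPredictor(sam)

def segment_by_text(image: np.ndarray, text: str) -> list[np.ndarray]:
    predictor.set_image(image)
    masks = []
    for box in detect_boxes(image, text):
        # SAM accepts an xyxy box prompt; multimask_output=False keeps the top mask.
        m, _, _ = predictor.predict(box=box, multimask_output=False)
        masks.append(m[0])
    return masks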

Awesome Repositories for SAM

License

This project is released under the MIT license. Please see the LICENSE file for more information.
