UTF-404

Trying Out SAM2 in Various Ways!!

private study


UTF-404 2025. 4. 11. 11:25

🎯 Image & Video Tracker using SAM

This project is a collection of Python scripts that use Meta's Segment Anything Model (SAM) to segment and track objects in images and videos based on a chosen color.

SAM, the general-purpose segmentation model released by Meta, can identify object boundaries and shapes accurately without task-specific prompts: it accepts simple point or box prompts and also offers an automatic mask-generation mode. This project applies that capability to color-based object tracking and explores how it can be used across images and videos.


📁 Project Structure

Filename | Description
image_color_sam2.py | Interactive color-based object segmentation on static images. Segments objects based on the color region selected with a mouse click, generates the mask with SAM, and paints the object with a color.
video_color_sam2.py | Color-based object tracking in video using SAM. After a color is selected with the mouse, similar objects are tracked and masked in every frame.
tracking_sam2.py | More elaborate SAM-based tracking with duplicate-object removal, centroid computation, and distance-based (repulsion) adjustment.
video_sam2_demo.py | Simple video demo that tracks a mouse-clicked object with SAM. Suited to quick UI experiments and functionality checks.
test_sam2.py | Combined test script covering SAM together with Kalman filter creation, duplicate removal, and centroid computation.
color_run_sam2.py | Generates random colors and runs click-event-based object tracking. Includes saving of color-based masks.
last_test_sam2.py | Final integration test script that combines the features above, used to check mask visualization and point-based tracking.
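
The descriptions above mention centroid computation and duplicate filtering without showing how they might work. The following is a minimal sketch, not the repository's code. It assumes a mask is a plain grid of 0/1 values and that a "duplicate" is a centroid closer than a pixel threshold to one already kept. The names `mask_centroid`, `drop_duplicates`, and `min_distance` are hypothetical.

```python
import math


def mask_centroid(mask):
    """Return the (x, y) centroid of a binary mask (rows of 0/1), or None if empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def drop_duplicates(centroids, min_distance=10.0):
    """Keep a centroid only if it is at least min_distance from every kept one."""
    kept = []
    for cx, cy in centroids:
        if all(math.hypot(cx - kx, cy - ky) >= min_distance for kx, ky in kept):
            kept.append((cx, cy))
    return kept
```

Earlier centroids win in this sketch, so the input order matters. Sorting the candidates by mask score before filtering would be one way to prefer the more reliable detections.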

🔧 Installation

1️⃣ Create a virtual environment

conda create -n sam_env python=3.10 -y
conda activate sam_env

2️⃣ Install dependencies

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install opencv-python matplotlib
pip install git+https://github.com/facebookresearch/segment-anything.git

3️⃣ Download the SAM pretrained model

Meta's official pretrained checkpoint (ViT-H) can be downloaded with:

wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth -P ./checkpoints

For more details on how SAM works, refer to the Segment Anything paper and the official GitHub repository.
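
Once the checkpoint is downloaded, it can be loaded through the `segment_anything` package installed in step 2. The sketch below shows one way to do this, assuming the ViT-H checkpoint. The helper names `load_predictor` and `segment_at_point` are hypothetical and are not taken from this project's scripts. The heavy imports sit inside the functions so that a missing checkpoint is reported before PyTorch is loaded.

```python
from pathlib import Path


def load_predictor(checkpoint, model_type="vit_h", device=None):
    """Build a SamPredictor from a downloaded checkpoint file."""
    checkpoint = Path(checkpoint)
    # Fail early with a clear message before importing the heavy dependencies.
    if not checkpoint.is_file():
        raise FileNotFoundError(f"SAM checkpoint not found: {checkpoint}")

    import torch
    from segment_anything import SamPredictor, sam_model_registry

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    sam = sam_model_registry[model_type](checkpoint=str(checkpoint))
    sam.to(device=device)
    return SamPredictor(sam)


def segment_at_point(predictor, image_rgb, x, y):
    """Return the highest-scoring mask for one foreground click at (x, y)."""
    import numpy as np

    predictor.set_image(image_rgb)  # expects an HxWx3 uint8 RGB array
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),  # 1 marks a foreground point
        multimask_output=True,
    )
    return masks[int(scores.argmax())]
```

Pass `load_predictor` the path of the checkpoint file saved under ./checkpoints. Images read with OpenCV are in BGR order, so convert them with `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` before calling `segment_at_point`.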


▶️ How to Use

📷 Image color-based segmentation

python image_color_sam2.py --image_path path/to/image.jpg

🎥 Video color-based tracking

python video_color_sam2.py --video_path path/to/video.mp4

🧪 Run the SAM video demo

python video_sam2_demo.py --video_path path/to/sample.mp4

The user selects the color of an object in the image or video, and SAM generates masks for clusters of pixels similar to that color. The objects are then tracked frame by frame based on these masks.
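
The step from a clicked color to a SAM prompt can be sketched as follows. This is an illustration under stated assumptions, not the scripts' actual logic. It assumes similarity is measured in HSV space with fixed tolerances and that the matching pixels are averaged into a single point prompt. The names `similar_pixels`, `prompt_point`, and the tolerance parameters are hypothetical. The image is a plain list of rows of (r, g, b) tuples so that the example needs only the standard library.

```python
import colorsys


def similar_pixels(image, click_xy, hue_tol=0.05, sat_tol=0.3, val_tol=0.3):
    """Return (x, y) positions whose HSV color is close to the clicked pixel.

    image is a list of rows of (r, g, b) tuples with 0-255 channels.
    """
    cx, cy = click_xy
    ref_h, ref_s, ref_v = colorsys.rgb_to_hsv(*(c / 255 for c in image[cy][cx]))
    matches = []
    for y, row in enumerate(image):
        for x, rgb in enumerate(row):
            h, s, v = colorsys.rgb_to_hsv(*(c / 255 for c in rgb))
            # Hue is circular, so compare along the shorter arc.
            dh = abs(h - ref_h)
            dh = min(dh, 1.0 - dh)
            if dh <= hue_tol and abs(s - ref_s) <= sat_tol and abs(v - ref_v) <= val_tol:
                matches.append((x, y))
    return matches


def prompt_point(matches):
    """Average the matching positions into one (x, y) point prompt."""
    if not matches:
        return None
    n = len(matches)
    return (sum(x for x, _ in matches) / n, sum(y for _, y in matches) / n)
```

The averaged point can fall outside the object when the matching pixels form a ring or several separate blobs, so a real pipeline would likely split the matches into connected clusters and prompt SAM once per cluster. On full-size frames the per-pixel loop would normally be replaced by a vectorized threshold such as `cv2.inRange` on an HSV image.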


💡 Notes and Potential Applications

  • This project is relevant to industrial applications such as real-time surveillance, robot vision, and color-based quality inspection.
  • SAM performs strongly at recognizing and separating varied objects in video without task-specific prompts; this project combines that flexibility with color analysis for tracking.
  • The scripts are designed to work across a range of colors, resolutions, and lighting conditions, and parts of the code can be extended for custom use cases.

✅ This project is intended as a starting point for research and prototyping, and could be extended in the following directions:

  • Real-time video stream processing
  • Click-to-track systems driven by user mouse clicks
  • Object recognition for robot control or action triggers
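
For the real-time and click-to-track directions, per-frame centroids are commonly smoothed with a Kalman filter, which test_sam2.py is described as creating. The sketch below is a hand-written constant-velocity filter for a single coordinate, shown only to illustrate the idea. The class name `AxisKalman` and the noise parameters are hypothetical, and the project's own filter may be built differently (OpenCV, for instance, provides `cv2.KalmanFilter`).

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate (position, velocity)."""

    def __init__(self, position, process_var=1e-2, measure_var=1.0):
        self.x = float(position)  # estimated position
        self.v = 0.0              # estimated velocity
        # Large initial covariance: the starting velocity is unknown.
        self.p = [[1000.0, 0.0], [0.0, 1000.0]]
        self.q = process_var
        self.r = measure_var

    def predict(self, dt=1.0):
        """Advance the state by dt and return the predicted position."""
        self.x += dt * self.v
        (p00, p01), (p10, p11) = self.p
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.p = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.x

    def update(self, measured):
        """Correct the state with a measured position and return the estimate."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r           # innovation variance
        k0, k1 = p00 / s, p10 / s  # Kalman gain for position and velocity
        residual = measured - self.x
        self.x += k0 * residual
        self.v += k1 * residual
        # P = (I - K H) P with H = [1, 0]
        self.p = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.x
```

To track a 2D centroid, one filter per axis can be used: call `predict()` on both every frame, and call `update()` only on frames where SAM returns a mask, so the prediction carries the track through short detection gaps.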

🔗 GitHub

https://github.com/utf-404/sam2_image_video_detection
