💡 Getting to Know SAM2

UTF-404 2025. 2. 2. 14:09

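SAM2 (Segment Anything Model 2)

: An object segmentation model developed by Meta. Given an image and a prompt, it generates a mask for the target object. It is designed to identify and segment objects of interest accurately, and it is the successor to the original SAM. (SAM2 also handles video, but this post only covers still images.)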

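📍 Key Features of SAM2

Image Embedding :

The input image is converted into a high-dimensional vector representation.
This embedding is what lets the model segment objects quickly and accurately from different kinds of prompts (points, boxes, existing masks).

Prompt Types :

Point : a point marking the location of a specific object
Box : a rectangle that contains the object
Mask : a previously generated mask, used as the starting point for follow-up work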
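
As a rough illustration of what point and box prompts look like in code, the sketch below packs them into the keyword arguments that the `predict` method of `SamPredictor` and `SAM2ImagePredictor` accepts (`point_coords`, `point_labels`, `box`). The helper `make_prompt` is a hypothetical name introduced here, not part of either library, and plain lists stand in for the NumPy arrays the real predictors expect.

```python
from typing import Optional, Sequence


def make_prompt(
    points: Sequence[Sequence[float]] = (),
    labels: Sequence[int] = (),
    box: Optional[Sequence[float]] = None,
) -> dict:
    """Hypothetical helper: validate prompts and return predict() keyword arguments."""
    if len(points) != len(labels):
        raise ValueError("each point needs exactly one label")
    if any(len(p) != 2 for p in points):
        raise ValueError("points must be (x, y) pairs")
    if any(label not in (0, 1) for label in labels):
        raise ValueError("labels must be 1 (foreground) or 0 (background)")
    prompt = {}
    if points:
        prompt["point_coords"] = [list(p) for p in points]
        prompt["point_labels"] = list(labels)
    if box is not None:
        x0, y0, x1, y1 = box
        if x1 <= x0 or y1 <= y0:
            raise ValueError("box must be (x_min, y_min, x_max, y_max)")
        prompt["box"] = [x0, y0, x1, y1]
    return prompt


# One foreground click on the object plus one background click to exclude a region
prompt = make_prompt(points=[(434, 296), (50, 50)], labels=[1, 0])
print(prompt)
```

A label of 1 marks a point on the object and 0 marks a point that should be excluded, which is how a second click can trim an over-large mask.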

Efficiency :

Adding a prompt does not require a new embedding. The existing embedding is reused, so each additional segmentation is fast.
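
The "embed once, prompt many times" idea can be shown with a toy class. This is only a sketch of the caching pattern: the "encoder" and "decoder" below are trivial arithmetic stand-ins, and `CachedSegmenter` is a made-up name, not SAM2's actual implementation.

```python
class CachedSegmenter:
    """Toy model of the 'embed once, prompt many times' pattern (not SAM2 internals)."""

    def __init__(self):
        self.encoder_calls = 0
        self._embedding = None

    def set_image(self, image):
        # Expensive step: runs once per image.
        self.encoder_calls += 1
        self._embedding = [sum(row) for row in image]  # stand-in for a real image encoder

    def predict(self, point):
        # Cheap step: reuses the cached embedding for every new prompt.
        if self._embedding is None:
            raise RuntimeError("call set_image() first")
        x, y = point
        return self._embedding[y] + x  # stand-in for a real mask decoder


segmenter = CachedSegmenter()
segmenter.set_image([[1, 2, 3], [4, 5, 6]])
results = [segmenter.predict(p) for p in [(0, 0), (1, 1), (2, 0)]]
print(results, segmenter.encoder_calls)  # [6, 16, 8] 1
```

Three prompts are answered, but the expensive encoding step runs only once.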

High-Quality Masks :

Objects of widely varying size and complexity can be segmented accurately.

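How SAM2 Works

Set the image:
- Call the set_image method to set the image and compute its embedding.
Provide a prompt:
- Call the predict method with points, a box, or a mask to segment the object.
Get the result:
- The model generates the object mask from the prompt and returns it.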
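
The three steps above might look roughly like this with the `SAM2ImagePredictor` class from the sam2 repository. This is a minimal sketch, assuming the package is installed and that the config and checkpoint paths used later in this post exist locally. `segment_point` and `pick_best` are helper names made up for this example.

```python
def pick_best(masks, scores):
    """Return the candidate mask with the highest predicted quality score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return masks[best]


def segment_point(image, x, y,
                  model_cfg="configs/sam2.1/sam2.1_hiera_l.yaml",
                  checkpoint="checkpoints/sam2.1_hiera_large.pt"):
    """Segment the object under pixel (x, y) in an RGB image array of shape (H, W, 3)."""
    # Imported lazily so the rest of this snippet loads without sam2 installed.
    import numpy as np
    import torch
    from sam2.build_sam import build_sam2
    from sam2.sam2_image_predictor import SAM2ImagePredictor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint, device=device))
    predictor.set_image(image)               # 1. set the image (computes the embedding)
    masks, scores, _ = predictor.predict(    # 2. pass the prompt
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),          # 1 = foreground
        multimask_output=True,               # return several candidate masks
    )
    return pick_best(masks, scores)          # 3. keep the highest-scoring mask
```

With `multimask_output=True` a single point yields several candidate masks of the same object, so the sketch keeps only the one with the highest score.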

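Things to Keep in Mind When Using SAM2

Hardware requirements: it is a large model, so in practice you want a GPU. An A100 or L4 is recommended.
Memory management: depending on image size and prompt complexity you can run out of memory, so adjust them as needed.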
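
One simple way to keep memory in check is to cap the longer side of the image before handing it to the model. The sketch below only computes the target size. `fit_within` is a hypothetical helper, and the 1024-pixel cap is just an example value.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple:
    """Return (width, height) scaled down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 1024))  # (1024, 768)
print(fit_within(800, 600, 1024))    # (800, 600)
```

The returned tuple can be passed straight to PIL's `Image.resize`, which keeps the aspect ratio intact.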

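Main Use Cases for SAM2

Computer vision research: studying image segmentation and developing new models.
Medical image analysis: segmenting lesions or specific tissue in medical images such as MRI and CT scans.
Robot vision: segmenting specific objects so a robot can recognize them and use them in a task.
Autonomous driving: segmenting pedestrians, vehicles, signs, and other objects in road scenes.
Creative tools: selecting or isolating a region in image editing.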

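📍Installing and Using SAM2

This is the demo code (GitHub) provided by Meta. I worked through the demo code and README there myself.

If you want to try it in GUI mode, you can run it from the second link. It is still a demo version, but it is truly groundbreaking..!!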

https://github.com/facebookresearch/sam2

https://ai.meta.com/sam2/

From here on, I'll list the code I actually used.

1. This code uses SAM2 to segment every object in an image.

import os
import torch
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_automatic_mask_generator import SAM2AutomaticMaskGenerator

# Set an environment variable for the CUDA caching allocator
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

# Clear the GPU memory cache
torch.cuda.empty_cache()

# Select GPU or CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Load the image and halve its size
image = Image.open('images/car2.jpg').convert("RGB")
image = image.resize((image.width // 2, image.height // 2))
image = np.array(image)

# Build the model
sam2_checkpoint = "checkpoints/sam2.1_hiera_large.pt"
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"
sam2 = build_sam2(model_cfg, sam2_checkpoint, device=device, apply_postprocessing=False)

# Initialize the mask generator
mask_generator = SAM2AutomaticMaskGenerator(
    model=sam2,
    points_per_side=16,
    points_per_batch=32,
    pred_iou_thresh=0.5,
    stability_score_thresh=0.85,
    min_mask_region_area=50,
)

# Generate masks
masks = mask_generator.generate(image)
print(f"Number of masks generated: {len(masks)}")

# Print per-mask information
for i, ann in enumerate(masks):
    print(f"\nMASK {i+1}:")
    print(f"  Area: {ann['area']}")                              # number of pixels covered by the mask
    print(f"  Predicted IoU: {ann['predicted_iou']}")            # the model's own estimate of mask quality (IoU)
    print(f"  Segmentation shape: {ann['segmentation'].shape}")  # shape of the 2D boolean mask (same height and width as the input image)

# Visualize the results
def show_anns(anns):
    if len(anns) == 0:
        print("No annotations found.")
        return
    sorted_anns = sorted(anns, key=(lambda x: x['area']), reverse=True)
    img = np.ones((image.shape[0], image.shape[1], 4))
    img[:, :, 3] = 0
    for ann in sorted_anns:
        m = ann['segmentation']
        color_mask = np.concatenate([np.random.random(3), [0.5]])
        img[m] = color_mask
    plt.figure(figsize=(20, 20))
    plt.imshow(image)
    plt.imshow(img)
    plt.axis('off')
    plt.show()

show_anns(masks)
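
The automatic generator tends to return many small or low-confidence masks, so a post-filter on the `area` and `predicted_iou` fields printed above can be handy. The sketch below is one way to do it. `top_masks` is a made-up helper, its thresholds are arbitrary example values, and the fake records only mimic the two keys it reads.

```python
def top_masks(anns, min_area=500, min_iou=0.8, limit=5):
    """Keep confident, reasonably large masks and return the largest few."""
    kept = [a for a in anns if a["area"] >= min_area and a["predicted_iou"] >= min_iou]
    return sorted(kept, key=lambda a: a["area"], reverse=True)[:limit]


# Fake records with two of the keys the generator returns (segmentation omitted)
fake = [
    {"area": 12000, "predicted_iou": 0.95},
    {"area": 300, "predicted_iou": 0.99},    # too small
    {"area": 8000, "predicted_iou": 0.60},   # too uncertain
    {"area": 20000, "predicted_iou": 0.88},
]
selected = top_masks(fake)
print([a["area"] for a in selected])  # [20000, 12000]
```

Passing `top_masks(masks)` to `show_anns` instead of the full list should give a less cluttered overlay.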

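2. Here I pick the parts of the image I need by pixel coordinates and save the result as a black-and-white mask. Note that the code as written paints the selected region white and everything else black.

The model used here is the original SAM (1), not SAM2.

While trying various things to see how the two actually differ, I ended up going all the way back to SAM (1).. and it left me convinced that SAM2's performance is far ahead.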

import os
import torch
from PIL import Image
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Model settings
model_type = "vit_b"  # available models: vit_b, vit_l, vit_h
model_path = "models/sam_vit_b.pth"
sam_model = sam_model_registry[model_type](checkpoint=model_path)

# Initialize the predictor
sam_predictor = SamPredictor(sam_model)

# Load the input image and convert it
input_image_path = "images/car2.jpg"
image = Image.open(input_image_path).convert("RGB")
image_np = np.array(image)
sam_predictor.set_image(image_np)

# Tire coordinates, picked by hand
input_points = np.array([
    [300, 600],  # first object (x, y)
    [600, 800],  # second object (x, y)
    [900, 600],  # third object (x, y)
])

# Make sure the output directory exists
os.makedirs("output", exist_ok=True)

# Predict and save one mask per tire.
# Passing all three points in a single predict() call treats them as prompts
# for ONE object and returns three candidate masks of it, so each point is
# sent on its own here.
for i, point in enumerate(input_points):
    masks, scores, logits = sam_predictor.predict(
        point_coords=point[None, :],   # shape (1, 2)
        point_labels=np.array([1]),    # label 1 = foreground (target)
        multimask_output=False,        # return a single mask for this point
    )
    mask = masks[0]                    # boolean (H, W) array
    output_mask_path = f"output/output_tire_mask_{i + 1}.png"  # where the mask is saved
    Image.fromarray((mask * 255).astype("uint8")).save(output_mask_path)
    print(f"Tire {i + 1} segmented! Saved to {output_mask_path}.")
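
An easy mistake with hand-picked coordinates is mixing up axis order: prompts are given as (x, y), while a mask array is indexed [row][column], that is [y][x]. A quick sanity check is to confirm that the clicked point actually lies inside the mask it produced. `point_in_mask` below is a hypothetical helper for that check. It is written against nested lists, though indexing a 2D NumPy boolean mask the same way should also work.

```python
def point_in_mask(mask, x, y):
    """True if pixel (x, y) is inside the image and set in the mask.

    Prompts use (x, y) order, but a mask is indexed [row][column], i.e. [y][x].
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    if not (0 <= x < width and 0 <= y < height):
        return False
    return bool(mask[y][x])


# 3 rows x 4 columns; only the pixel at x=3, y=1 is set
mask = [
    [False, False, False, False],
    [False, False, False, True],
    [False, False, False, False],
]
print(point_in_mask(mask, 3, 1), point_in_mask(mask, 1, 3))  # True False
```

If the check fails for a point you clicked on the object, the coordinates are probably swapped or refer to a differently sized copy of the image.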

3. This is simply a color version of the second script.

import os
import random
import torch
from PIL import Image
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Model settings
model_type = "vit_b"  # available models: vit_b, vit_l, vit_h
model_path = "models/sam_vit_b.pth"
sam_model = sam_model_registry[model_type](checkpoint=model_path)

# Initialize the predictor
sam_predictor = SamPredictor(sam_model)

# Load the input image and convert it
input_image_path = "images/car2.jpg"  # input image path (the same script was also run with "images/tire_03.jpg")
image = Image.open(input_image_path).convert("RGB")  # convert to RGB
image_np = np.array(image)  # convert to a numpy array
sam_predictor.set_image(image_np)  # set the image

# Tire center coordinates (used as point prompts)
input_points = np.array([
    [434, 296],
    [423, 562],
    [447, 751],
])

# Start from a copy of the original image
output_image = image_np.copy()

# Segment each point on its own and paint its mask a random color.
# Sending all three points in one predict() call would return three
# candidate masks of a single object instead of one mask per tire.
for point in input_points:
    masks, scores, logits = sam_predictor.predict(
        point_coords=point[None, :],   # shape (1, 2)
        point_labels=np.array([1]),    # label 1 = foreground (target)
        multimask_output=False,        # return a single mask for this point
    )
    color = np.array([random.randint(0, 255) for _ in range(3)], dtype=np.uint8)  # random (R, G, B)
    output_image[masks[0]] = color     # masks[0] is a boolean (H, W) array

# Save the final image with the colored masks applied
os.makedirs("output", exist_ok=True)
final_output_path = "output/colored_mask_output.png"
Image.fromarray(output_image).save(final_output_path)
print(f"Saved the color-masked result to {final_output_path}.")
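
The script above overwrites the masked pixels with a solid color, which hides the object underneath. A common alternative is alpha blending, where each output channel is (1 - alpha) * pixel + alpha * color. The sketch below shows the arithmetic for a single RGB pixel. `blend_pixel` is a made-up helper, and 0.5 is just an example opacity.

```python
def blend_pixel(pixel, color, alpha=0.5):
    """Mix an overlay color into one RGB pixel: out = (1 - alpha) * pixel + alpha * color."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return tuple(int(round((1 - alpha) * p + alpha * c)) for p, c in zip(pixel, color))


print(blend_pixel((100, 100, 100), (200, 0, 0)))             # (150, 50, 50)
print(blend_pixel((100, 100, 100), (200, 0, 0), alpha=1.0))  # (200, 0, 0)
```

With alpha set to 1.0 this reduces to the solid overwrite used in the script, and with NumPy the same formula can be applied to all masked pixels at once.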

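Conclusion

This is just my opinion, but the world really is changing fast.. Remarkable technologies keep advancing quickly and making their way out into the world.

Trying out all these new technologies firsthand lately, I find myself thinking that while I should be able to build things like this myself.. I should also think about how to put them to use in the services I develop. My one regret is that I couldn't bring this object recognition technology into my graduation presentation. How great would that have been? With that thought, I'll wrap up this post.

I hope to come back with a better post next time.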
