Multi-label semantic segmentation task #2174

Closed
Wulx2050 opened this issue May 30, 2022 · 4 comments

@Wulx2050

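The tasks and models in PaddleSeg all seem to assume one label per pixel, but sometimes a single pixel carries several labels, as in this competition:
https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/overview

The competition is a multi-label semantic segmentation task: some pixels have two or more labels at once, and the output has shape Height×Width×Num_class, with Num_class >= 2.

I could not find any tutorial or algorithm for this in PaddleSeg. How should I handle this kind of multi-label semantic segmentation task?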

Wulx2050 added the enhancement (New feature or request) label May 30, 2022
@Wulx2050
Author

Wulx2050 commented May 30, 2022

Also, many datasets now ship their annotation files in RLE (run-length encoding) format. For convenience, could you please add an encoding function and a decoding function to PaddleSeg?

import numpy as np

# ref: https://www.kaggle.com/paulorzp/run-length-encode-and-decode
# modified from: https://www.kaggle.com/inversion/run-length-decoding-quick-start
def rle_decode(mask_rle, shape, color=1):
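    """Decode a run-length-encoded string into a mask array.

    Runs are written into the flattened image in row-major (C) order.

    Args:
        mask_rle (str): run-length string formatted as "start length start length ...",
            with 1-based start positions
        shape (tuple of ints): (height, width) or (height, width, depth) of the array to return
        color (number or sequence): value written into the masked pixels; defaults to 1

    Returns:
        Mask (np.array, float32)
            - color (1 by default) indicating mask
            - 0 indicating background

    """
    # Split the string by whitespace, then convert it into an integer array
    s = np.array(mask_rle.split(), dtype=int)

    # Even positions hold the 1-based start of each run, odd positions hold its length
    starts = s[0::2] - 1
    lengths = s[1::2]
    ends = starts + lengths

    # The image is flattened first since RLE is a 1D "run"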
    if len(shape)==3:
        h, w, d = shape
        img = np.zeros((h * w, d), dtype=np.float32)
    else:
        h, w = shape
        img = np.zeros((h * w,), dtype=np.float32)

    # The color here is actually just any integer you want!
    for lo, hi in zip(starts, ends):
        img[lo : hi] = color
        
    # Don't forget to change the image back to the original shape
    return img.reshape(shape)

# https://www.kaggle.com/namgalielei/which-reshape-is-used-in-rle
def rle_decode_top_to_bot_first(mask_rle, shape):
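    """Decode a column-major (top-to-bottom first) run-length string into a binary mask.

    Args:
        mask_rle (str): run-length string formatted as "start length start length ...",
            with 1-based start positions
        shape (tuple of ints): (height, width) of the array to return

    Returns:
        Mask (np.array, uint8)
            - 1 indicating mask
            - 0 indicating background

    """
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    # With shape given as (height, width), a Fortran-order reshape fills each column
    # top to bottom. The previous reshape((shape[1], shape[0]), order='F').T was
    # equivalent to a plain row-major reshape.
    return img.reshape((shape[0], shape[1]), order='F')  # Reshape from top -> bottom first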

# ref.: https://www.kaggle.com/stainsby/fast-tested-rle
def rle_encode(img):
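    """Encode a binary mask as a run-length string.

    The mask is flattened in row-major (C) order, matching rle_decode.

    Args:
        img (np.array):
            - 1 indicating mask
            - 0 indicating background

    Returns:
        Run-length string formatted as "start length start length ...", with 1-based start positions
    """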
    
    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)
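
To tie the two requests together, here is a minimal sketch of how per-class RLE strings could be decoded into one Height×Width×Num_class multi-label mask. It assumes the row-major, 1-based RLE convention of `rle_decode` above. The helper names `decode_rle` and `build_multilabel_mask` and the sample strings are made up for illustration; none of this is part of PaddleSeg.

```python
import numpy as np


def decode_rle(mask_rle, shape):
    """Decode a row-major, 1-based "start length ..." RLE string into a (H, W) uint8 mask."""
    values = np.array(mask_rle.split(), dtype=int)
    starts = values[0::2] - 1
    lengths = values[1::2]
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, n in zip(starts, lengths):
        flat[lo:lo + n] = 1
    return flat.reshape(shape)


def build_multilabel_mask(rle_per_class, shape):
    """Stack one binary mask per class into a (H, W, num_classes) array.

    An empty string means the class is absent from the image.
    """
    channels = [decode_rle(rle, shape) for rle in rle_per_class]
    return np.stack(channels, axis=-1)


# Hypothetical 3x4 image with two classes that overlap on one pixel.
rles = ["2 2", "3 3"]
mask = build_multilabel_mask(rles, (3, 4))
print(mask.shape)         # (3, 4, 2)
print(mask.sum(axis=-1))  # a value of 2 marks a pixel that belongs to both classes
```

Keeping one channel per class means overlapping regions need no special handling: each class is decoded independently and the overlap simply shows up as two channels set on the same pixel.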

juncaipeng self-assigned this May 31, 2022
@juncaipeng
Collaborator

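Hi, PaddleSeg does not currently support multi-label semantic segmentation. If you need it, you will have to modify the model structure, the loss computation, and the training and prediction procedures by hand.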
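
As a rough illustration of what such manual changes involve (this is not PaddleSeg code, and the function names are hypothetical), a common approach is to replace the per-pixel softmax over classes with an independent sigmoid per class, train with binary cross-entropy against a Height×Width×Num_class target, and threshold each channel at prediction time. A NumPy sketch of the loss and prediction steps:

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def multilabel_bce_loss(logits, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels and classes.

    logits, targets: arrays of shape (H, W, num_classes); targets hold 0/1.
    """
    probs = np.clip(sigmoid(logits), eps, 1.0 - eps)
    loss = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return float(loss.mean())


def multilabel_predict(logits, threshold=0.5):
    """Threshold each class independently, so a pixel may receive several labels."""
    return (sigmoid(logits) >= threshold).astype(np.uint8)


# Toy 1x2 image with 2 classes: the first pixel belongs to both classes.
logits = np.array([[[4.0, 3.0], [-4.0, 2.0]]])
targets = np.array([[[1.0, 1.0], [0.0, 1.0]]])
pred = multilabel_predict(logits)
loss = multilabel_bce_loss(logits, targets)
```

Because each class gets its own sigmoid, the class scores at a pixel do not compete with each other, which is what allows more than one label per pixel.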

@alexhmyang

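It has been a year and this still is not supported? EISeg can annotate multiple labels for segmentation classes. I finished labeling, and now PaddleSeg throws an error saying it is not supported?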

@MINGtoMING
Contributor

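@alexhmyang Hi, I am currently working on multi-label semantic segmentation support for PaddleSeg. In a multi-label semantic segmentation task, a pixel can belong to several classes at once (ordinary semantic segmentation assigns each pixel exactly one class), so different objects in an image may overlap. However, EISeg does not currently support overlapping objects. Could you describe how you annotated the overlapping regions, and what format your finished annotations are in? That would let me provide a more convenient data-loading interface. Thanks!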
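
One storage format that can represent overlapping regions is a single-channel integer image in which bit c of each pixel is set when class c is present. This is an assumption for illustration only, not the format EISeg or PaddleSeg uses, and the helper names below are hypothetical. The sketch packs and unpacks such an image with NumPy and supports up to 32 classes:

```python
import numpy as np


def pack_multilabel(mask):
    """Pack a (H, W, C) binary mask into a (H, W) integer image; bit c marks class c."""
    num_classes = mask.shape[-1]
    weights = (1 << np.arange(num_classes)).astype(np.uint32)
    return (mask.astype(np.uint32) * weights).sum(axis=-1).astype(np.uint32)


def unpack_multilabel(packed, num_classes):
    """Recover the (H, W, C) binary mask from the bit-packed image."""
    bits = (packed[..., None] >> np.arange(num_classes)) & 1
    return bits.astype(np.uint8)


# Hypothetical 2x2 image with 3 classes; pixel (0, 0) belongs to classes 0 and 2.
mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0, [0, 2]] = 1
mask[1, 1, 1] = 1
packed = pack_multilabel(mask)
restored = unpack_multilabel(packed, 3)
```

A per-class stack of binary masks carries the same information; the bit-packed form is only convenient when the annotation has to fit in one grayscale image file.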
