Hierarchical Aggregation for 3D Instance Segmentation
Paper info: HAIS
| Authors | Affiliation | Year | Venue | Citations | Research area | Method | Code | Notes |
|---|---|---|---|---|---|---|---|---|
| Shaoyu Chen, Wenyu Liu | HUST & Horizon Robotics | 2021 | ICCV | 1 | 3D point cloud instance segmentation | center-prediction-based method | code | ScanNet SOTA |
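3D point cloud instance segmentation SOTA leaderboards
I looked up the SOTA for 3D point cloud segmentation on Papers with Code. For instance segmentation, the two commonly used datasets are the scene datasets ScanNet and S3DIS. I also plan to run my own experiments on these two datasets.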
Benchmark

Works that currently perform well:
- Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks (SSTNet), ICCV 2021
- Hierarchical Aggregation for 3D Instance Segmentation (HAIS), ICCV 2021
- Learning Gaussian Instance Segmentation in Point Clouds (GICN), a 2020 work whose scores are about the same as PointGroup's
- PointGroup, a 2020 work
ScanNet

    ScanNet also has a separate leaderboard of its own:
    http://kaldir.vc.in.tum.de/scannet_benchmark/semantic_instance_3d
S3DIS

TL;DR
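An improved version of PointGroup that uses hierarchical aggregation: a fixed-bandwidth clustering algorithm first produces a series of candidate instance sets, and the candidate point sets are then merged.
It is somewhat faster than PointGroup and reaches SOTA performance without NMS, 4 points higher than PointGroup.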
Background
- 2D instance segmentation
  - Top-Down Method
    - two-stage (bbox→mask)
  - Bottom-Up Method
    - Clustering based
- Why direct clustering is difficult:
  - A point cloud usually contains a large number of points (many samples to cluster)
  - The number of instances in a point cloud has large variations for different 3D scenes (the number of samples per cluster varies greatly)
  - The sizes of instances vary significantly (the distribution inside clusters is irregular)
  - Each point has a very weak feature, i.e., 3D coordinate and color (few features to cluster on)
- Deep Learning on Point Clouds
  - Proposal-based Instance Segmentation
    - 2D: Mask R-CNN
    - GSPN
    - 3D-SIS
    - 3D-BoNet
    - 3D-MPA
    - GICN
  - Clustering-based Instance Segmentation
    - SGPN
    - JSIS3D
    - MTML
    - OccuSeg
    - PointGroup
Method
The whole pipeline is essentially a heavily modified version of PointGroup.

Backbone: 3D sparse CNN
Hierarchical aggregation → Point aggregation + Set aggregation
We first aggregate points to sets with low bandwidth to avoid over-segmentation and then set aggregation with dynamic bandwidth is adopted to form complete instances.
Set aggregation may absorb noisy point sets into predictions, making the aggregated instances over-complete.
Sub-network → outlier filtering and mask quality scoring
Outline
- point-wise prediction network → extracts features from point clouds and predicts point-wise semantic labels and center shift vectors
- point aggregation module → produces preliminary instance predictions
- set aggregation module → expands incomplete instances to cover missing parts
- intra-instance prediction network → smooths instances to filter out outliers
1. Point-wise Prediction Network
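Similar to PointGroup (backbone: 3D U-Net)
- Semantic label prediction (CE loss, 2 layers)
- Center shift vector prediction (Smooth L1 loss), $\delta x$
  - PointGroup uses L1 loss + cosine loss: L1 is not differentiable at the optimum and tends to oscillate there; the cosine loss has a similar problem
  - Why not L2 loss: outliers produce large gradients (gradient explosion?), although its advantage is that gradients are small near the optimum, so it converges easily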
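The trade-off between the three regression losses can be checked numerically. The snippet below is a minimal sketch, not code from the paper: the function names are hypothetical, and `beta=1.0` is an assumed switch point (the common default for Smooth L1). It prints the gradient of each loss with respect to the residual `r` for a small, a medium, and an outlier residual.

```python
def sign(r):
    # Sign of r as an int: -1, 0 or 1.
    return (r > 0) - (r < 0)

def l1_grad(r):
    # d|r|/dr: constant magnitude 1, so the update does not shrink near r = 0
    # (the derivative is undefined exactly at 0; 0 is returned by convention).
    return sign(r)

def l2_grad(r):
    # d(0.5 * r^2)/dr: proportional to r, so outliers give very large gradients.
    return r

def smooth_l1_grad(r, beta=1.0):
    # Smooth L1: 0.5 * r^2 / beta if |r| < beta, else |r| - 0.5 * beta.
    # Its gradient behaves like L2 near zero and like L1 for large residuals.
    if abs(r) < beta:
        return r / beta
    return sign(r)

for r in (0.01, 0.5, 10.0):
    print(f"r={r}: L1={l1_grad(r)}, L2={l2_grad(r)}, SmoothL1={smooth_l1_grad(r)}")
```

For the outlier residual the Smooth L1 gradient stays bounded like L1, while for the small residual it shrinks like L2, which matches the motivation given in the notes.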
 

2. Point Aggregation

- Ignore points with background labels (floor, wall)
- Algorithm: similar to PointGroup
- Shrink the clustering radius (avoids over-segmentation → instances break into fragments)
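A minimal sketch of this step, assuming the PointGroup-style procedure of shifting each point by its predicted center offset and then growing clusters by breadth-first search within a fixed radius. The function name, argument layout, and the brute-force O(n²) neighbor search are illustrative assumptions, not the authors' implementation.

```python
from collections import deque
from math import dist

def point_aggregation(coords, shifts, labels, radius, background=frozenset()):
    """Group points into preliminary sets (hypothetical helper, brute force)."""
    # Move every point toward its predicted instance center.
    shifted = [tuple(c + s for c, s in zip(p, d)) for p, d in zip(coords, shifts)]
    visited = [False] * len(coords)
    clusters = []
    for seed in range(len(coords)):
        # Background points (e.g. floor, wall) never start or join a cluster.
        if visited[seed] or labels[seed] in background:
            continue
        visited[seed] = True
        queue, cluster = deque([seed]), []
        while queue:
            i = queue.popleft()
            cluster.append(i)
            for j in range(len(coords)):
                # Grow only through same-class points closer than the fixed radius.
                if (not visited[j] and labels[j] == labels[i]
                        and dist(shifted[i], shifted[j]) < radius):
                    visited[j] = True
                    queue.append(j)
        clusters.append(cluster)
    return clusters

coords = [(0, 0, 0), (1, 0, 0), (5, 0, 0), (6, 0, 0), (0, 0, -1)]
shifts = [(0.5, 0, 0), (-0.5, 0, 0), (0.5, 0, 0), (-0.5, 0, 0), (0, 0, 0)]
labels = ["chair", "chair", "chair", "chair", "floor"]
clusters = point_aggregation(coords, shifts, labels, radius=0.3, background={"floor"})
print(clusters)  # [[0, 1], [2, 3]]
```

With a small radius, points whose shifted coordinates do not land close to the rest of their instance end up in separate small sets, which is what the following step has to merge.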
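3. Set Aggregation
This is a new step proposed in the paper. Its purpose is to merge a primary instance with the fragments that belong to it.

Rules for merging a fragment instance into a primary instance:
- Both belong to the same class
- The distance between their geometric centers must be below a threshold (a value obtained from statistics)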
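The two merging rules can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: `set_aggregation`, `centroid`, and the parameter names are hypothetical, a set counts as a fragment when it has fewer than `min_primary_size` points (an assumed criterion), and `max_center_dist` is an assumed per-class threshold table.

```python
from math import dist

def centroid(points):
    # Geometric center of a list of (x, y, z) tuples.
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def set_aggregation(sets, max_center_dist, min_primary_size):
    """sets: list of (class_label, points). Returns (primaries, leftovers)."""
    primaries = [(c, list(pts)) for c, pts in sets if len(pts) >= min_primary_size]
    fragments = [(c, pts) for c, pts in sets if len(pts) < min_primary_size]
    # Centers are computed once, before any merging, so absorbing a fragment
    # does not move the center that later fragments are compared against.
    centers = [centroid(pts) for _, pts in primaries]
    leftovers = []
    for cls, pts in fragments:
        frag_center = centroid(pts)
        candidates = []
        for k, (primary_cls, _) in enumerate(primaries):
            d = dist(frag_center, centers[k])
            # Rule 1: same class. Rule 2: centers closer than the threshold.
            if primary_cls == cls and d < max_center_dist[cls]:
                candidates.append((d, k))
        if candidates:
            _, nearest = min(candidates)
            primaries[nearest][1].extend(pts)
        else:
            leftovers.append((cls, pts))
    return primaries, leftovers

sets = [
    ("chair", [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]),  # primary
    ("chair", [(0.5, 0.5, 1.0)]),   # same class, nearby: merged
    ("table", [(0.5, 0.5, 0.5)]),   # nearby but different class: kept apart
    ("chair", [(9, 9, 9)]),         # same class but too far: kept apart
]
primaries, leftovers = set_aggregation(
    sets, max_center_dist={"chair": 1.5, "table": 1.5}, min_primary_size=3)
print(len(primaries[0][1]), len(leftovers))  # 5 2
```

Picking the nearest qualifying primary is one possible tie-breaking choice when several primaries satisfy both rules.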


4. Intra-instance Prediction Network
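This improves on PointGroup's ScoreNet:
- The new network first predicts a mask to distinguish the instance foreground from the background

- It then predicts the IoU between the mask and the best-matched ground truth (CE loss)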
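To make the two outputs concrete, the sketch below filters an aggregated instance with a per-point mask and computes the IoU against the best-matched ground-truth instance, which is the quantity the score branch is trained to predict. The function names, the representation of instances as lists of point indices, and the `threshold=0.5` cut-off are assumptions for illustration only.

```python
def iou(a, b):
    # Intersection over union of two collections of point indices.
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def refine_and_score(instance, mask_probs, gt_instances, threshold=0.5):
    """Drop points the mask marks as background, then score the remainder."""
    # Keep only points predicted as instance foreground (outlier filtering).
    kept = [i for i in instance if mask_probs[i] > threshold]
    # IoU with the best-matched ground-truth instance (the regression target).
    best = max((iou(kept, gt) for gt in gt_instances), default=0.0)
    return kept, best

instance = [0, 1, 2, 3, 4]
mask_probs = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.2, 4: 0.6}
gt_instances = [[0, 1, 2, 5], [3, 6]]
kept, best = refine_and_score(instance, mask_probs, gt_instances)
print(kept, best)  # [0, 1, 2, 4] 0.6
```

At inference time no ground truth is available, so the network's predicted IoU takes the place of `best` as the instance confidence.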


Open issues
- The input is a single instance (not a good fit for the ShapeNet dataset)
5. Multi-task Training

Other characteristics:
- NMS-free
Experiments
Result

Ablation Study



Comparison of inference time

Qualitative results on ScanNet v2
