Hierarchical Aggregation for 3D Instance Segmentation
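Paper Information: HAIS
Authors | Affiliation | Year | Venue | Citations | Research area | Method | Code | Notes |
---|---|---|---|---|---|---|---|---|
Shaoyu Chen, Wenyu Liu | Huazhong University of Science and Technology & Horizon Robotics | 2021 | ICCV | 1 | 3D point cloud instance segmentation | center-prediction-based method | code | ScanNet SOTA |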
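3D Point Cloud Instance Segmentation SOTA Leaderboards
I looked up the SOTA for 3D point cloud segmentation on Papers with Code. For instance segmentation, the two scene datasets that most work uses are ScanNet and S3DIS, and I plan to run my own experiments on these two datasets as well.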
Benchmark
Works that currently perform well:
Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks (SSTNet) ICCV 2021
Hierarchical Aggregation for 3D Instance Segmentation (HAIS) ICCV 2021
Learning Gaussian Instance Segmentation in Point Clouds (GICN), 2020; scores roughly on par with PointGroup
PointGroup, 2020
ScanNet
    ScanNet also has its own separate leaderboard:
    http://kaldir.vc.in.tum.de/scannet_benchmark/semantic_instance_3d
S3DIS
TL;DR
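Proposes an improved version of PointGroup based on hierarchical aggregation: a fixed-bandwidth clustering algorithm first produces a series of candidate instance sets, and these candidate point sets are then merged.
Somewhat faster than PointGroup; reaches SOTA performance without NMS, 4 points higher than PointGroup.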
Background
2D instance segmentation
- Top-Down Method
  - two-stage (bbox→mask)
- Bottom-Up Method
  - Clustering based
Why direct clustering is difficult:
- A point cloud usually contains a large number of points (many samples to cluster)
- The number of instances in a point cloud has large variations for different 3D scenes (the number of samples per cluster differs greatly)
- The sizes of instances vary significantly (clusters are irregularly distributed internally)
- Each point has a very weak feature, i.e., 3D coordinate and color (few features to cluster on)
Deep Learning on Point Clouds
- Proposal-based Instance Segmentation
  - 2D: Mask R-CNN
  - GSPN
  - 3D-SIS
  - 3D-BoNet
  - 3D-MPA
  - GICN
- Clustering-based Instance Segmentation
  - SGPN
  - JSIS3D
  - MTML
  - OccuSeg
  - PointGroup
Method
The whole pipeline can fairly be described as a heavily modified version of PointGroup.
Backbone: 3D sparse CNN
Hierarchical aggregation → Point aggregation + Set aggregation
We first aggregate points to sets with low bandwidth to avoid over-segmentation and then set aggregation with dynamic bandwidth is adopted to form complete instances.
Set aggregation may absorb noisy point sets into predictions, making the aggregated instances over-complete.
Sub-network → outlier filtering and mask quality scoring
Outline
point-wise prediction network → extracts features from point clouds and predicts point-wise semantic labels and center shift vectors.
point aggregation module → preliminary instance predictions
set aggregation module → expands incomplete instances to cover missing parts
intra-instance prediction network → smooths instances to filter out outliers
1. Point-wise Prediction Network
Similar to PointGroup (backbone: 3D U-Net)
Semantic label prediction (CE Loss, 2 layer)
Center Shift Vector Prediction (Smooth L1 loss) $\delta x$
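- PointGroup uses L1 loss + cosine loss: the L1 loss is not differentiable at the optimum, which easily causes oscillation; the cosine loss has similar problems
- Why not L2 loss: outliers produce large gradients (possible gradient explosion?), although its advantage is that gradients are small near the optimum, so it converges easily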
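The trade-off between these losses is easiest to see from their gradients. The sketch below is illustrative only: the function names and the `beta` transition point are assumptions, not taken from the paper or the HAIS code, and it operates on a single scalar residual `r` (predicted shift minus target shift) rather than on full 3D vectors.

```python
def l1_grad(r):
    """Gradient of |r|; the subgradient 0 is chosen at r == 0."""
    return (r > 0) - (r < 0)


def l2_grad(r):
    """Gradient of 0.5 * r**2; grows linearly with the residual."""
    return r


def smooth_l1_grad(r, beta=1.0):
    """Gradient of Smooth L1 (beta is an assumed transition point).

    Loss is 0.5 * r**2 / beta for |r| < beta, and |r| - 0.5 * beta otherwise.
    """
    if abs(r) < beta:
        return r / beta              # quadratic zone: gradient shrinks near 0
    return 1.0 if r > 0 else -1.0    # linear zone: gradient is bounded


if __name__ == "__main__":
    for r in (-5.0, -0.1, 0.1, 5.0):
        print(f"r={r:+.1f}  L1={l1_grad(r):+d}  "
              f"L2={l2_grad(r):+.1f}  SmoothL1={smooth_l1_grad(r):+.1f}")
```

For a large residual the Smooth L1 gradient stays bounded like L1, while for a small residual it shrinks like L2, which matches the motivation given in the two bullets above.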
2. Point Aggregation
Ignore points with background labels (floor, wall)
Algorithm: similar to PointGroup
Shrink the clustering radius (avoid over-segmentation → stone fragments)
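A minimal sketch of this kind of radius-based aggregation, assuming a brute-force neighbor search and hypothetical names (`point_aggregation`, `radius`, `background`); it is not the HAIS or PointGroup implementation, which rely on GPU kernels. Points are first moved by their predicted center shift vectors, then grouped by breadth-first search over same-label neighbors within `radius`.

```python
from collections import deque
import math


def point_aggregation(coords, shifts, labels, radius, background=frozenset()):
    """Group shifted points that share a label and lie within `radius`.

    coords, shifts: lists of (x, y, z) tuples; labels: per-point class names.
    Returns a list of clusters, each a list of point indices.
    """
    # Move every point toward its predicted instance center.
    shifted = [tuple(c + s for c, s in zip(p, d)) for p, d in zip(coords, shifts)]
    visited = [False] * len(coords)
    clusters = []
    for seed in range(len(coords)):
        if visited[seed] or labels[seed] in background:
            continue  # background points never start or join a cluster
        visited[seed] = True
        queue, cluster = deque([seed]), []
        while queue:
            i = queue.popleft()
            cluster.append(i)
            for j in range(len(coords)):  # O(N^2) search, for clarity only
                if visited[j] or labels[j] != labels[i]:
                    continue
                if math.dist(shifted[i], shifted[j]) <= radius:
                    visited[j] = True
                    queue.append(j)
        clusters.append(cluster)
    return clusters


coords = [(0, 0, 0), (0.1, 0, 0), (2, 0, 0), (2.1, 0, 0), (5, 5, 0)]
shifts = [(0, 0, 0)] * 5
labels = ["chair", "chair", "chair", "chair", "floor"]
small = point_aggregation(coords, shifts, labels, 0.3, background={"floor"})
large = point_aggregation(coords, shifts, labels, 3.0, background={"floor"})
```

In this toy input, `small` keeps the two chair groups apart while `large` merges them into one cluster, which shows how sensitive the result is to the radius.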
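3. Set Aggregation
This is a new step proposed in the paper; its purpose is to merge each primary instance with the fragments that belong to it.
Rules for merging a fragment instance into a primary instance:
The two belong to the same class
The distance between their geometric centers is below a threshold (this value is obtained from statistics)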
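The two merging rules can be sketched as follows. This is an assumption-laden illustration, not the paper's code: the split into `primaries` and `fragments` is taken as given, a single scalar `max_center_dist` stands in for the statistically derived threshold, and each fragment is attached to its nearest qualifying primary instance.

```python
import math


def centroid(points):
    """Geometric center of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def set_aggregation(primaries, fragments, max_center_dist):
    """Merge fragments into primary instances.

    Each item is a dict with 'label' and 'points'. A fragment is merged when
    it has the same label as a primary instance and their centers are closer
    than `max_center_dist` (hypothetical stand-in for the learned statistic).
    Returns (merged_instances, unmerged_fragments).
    """
    merged = [{"label": p["label"], "points": list(p["points"])} for p in primaries]
    centers = [centroid(p["points"]) for p in primaries]  # fixed before merging
    leftovers = []
    for frag in fragments:
        frag_center = centroid(frag["points"])
        best, best_dist = None, max_center_dist
        for idx, prim in enumerate(primaries):
            if prim["label"] != frag["label"]:
                continue  # rule 1: same class
            d = math.dist(frag_center, centers[idx])
            if d < best_dist:  # rule 2: centers close enough
                best, best_dist = idx, d
        if best is None:
            leftovers.append(frag)
        else:
            merged[best]["points"].extend(frag["points"])
    return merged, leftovers


primaries = [{"label": "chair", "points": [(0, 0, 0), (1, 0, 0)]}]
fragments = [
    {"label": "chair", "points": [(0.5, 0.2, 0)]},  # close, same class
    {"label": "table", "points": [(0.5, 0, 0)]},    # close, wrong class
    {"label": "chair", "points": [(9, 9, 9)]},      # same class, too far
]
merged, leftovers = set_aggregation(primaries, fragments, max_center_dist=0.5)
```

Only the first fragment satisfies both rules here, so `merged[0]` ends up with three points and the other two fragments are returned unmerged.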
4. Intra-instance Prediction Network
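Improves on Score:
- The new network first predicts a mask to distinguish the instance foreground and background
- It then predicts the IoU between the mask and the best-matched GT (CE loss)
Open issues
- The input is a single instance (not friendly for the ShapeNet dataset)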
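The IoU with the best-matched GT that serves as the scoring target can be computed as in the sketch below. Representing masks as sets of point indices and the helper names `mask_iou` and `best_matched_iou` are assumptions made for illustration; they are not taken from the HAIS code.

```python
def mask_iou(pred, gt):
    """IoU between two masks given as collections of point indices."""
    pred, gt = set(pred), set(gt)
    union = len(pred | gt)
    return len(pred & gt) / union if union else 0.0


def best_matched_iou(pred, gt_instances):
    """Highest IoU between a predicted mask and any ground-truth instance."""
    return max((mask_iou(pred, g) for g in gt_instances), default=0.0)


pred_mask = {0, 1, 2, 3}
gt_instances = [{2, 3, 4, 5}, {0, 1, 2}]
target = best_matched_iou(pred_mask, gt_instances)
```

Here the second ground-truth instance is the best match (3 shared points out of 4 in the union), so `target` is 0.75.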
5. Multi-task Training
Other characteristics:
- NMS-free