Hierarchical Aggregation for 3D Instance Segmentation

Paper Info: HAIS

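Authors: Shaoyu Chen, Wenyu Liu
Affiliation: Huazhong University of Science and Technology (HUST) & Horizon Robotics
Year: 2021
Venue: ICCV
Citations: 1
Research area: 3D point cloud instance segmentation
Method: center-prediction-based method
Code: code
Notes: ScanNet SOTA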

Paper

Supporting Materials


3D Point Cloud Instance Segmentation SOTA Leaderboards

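I searched Papers with Code for the SOTA in 3D point cloud segmentation. For instance segmentation, the two commonly used datasets are both scene-level: ScanNet and S3DIS. I plan to run my own experiments on these two datasets as well.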

Benchmark

Works that currently perform well:

  • Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks (SSTNet) ICCV 2021

  • Hierarchical Aggregation for 3D Instance Segmentation (HAIS) ICCV 2021

  • Learning Gaussian Instance Segmentation in Point Clouds (GICN), 2020 work; scores are roughly on par with PointGroup

  • PointGroup, 2020 work

ScanNet

    ScanNet also maintains its own separate leaderboard:

    http://kaldir.vc.in.tum.de/scannet_benchmark/semantic_instance_3d

S3DIS


TL;DR

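Proposes an improved version built on PointGroup that uses hierarchical aggregation: a fixed-bandwidth clustering step first produces a series of candidate instance sets, and these candidate point sets are then merged.

Inference is somewhat faster than PointGroup, and the method reaches SOTA performance without NMS, improving on PointGroup by 4 points.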

Background

  • 2D instance segmentation

    • Top-Down Method
      • two-stage (bbox→mask)
    • Bottom-Up Method
      • Clustering based
  • Why direct clustering is difficult:

    • A point cloud usually contains a large number of points (many samples to cluster)
    • The number of instances in a point cloud varies widely across 3D scenes (the number of samples per cluster differs greatly)
    • The sizes of instances vary significantly (the distribution within clusters is irregular)
    • Each point has a very weak feature, i.e., 3D coordinate and color (few features to cluster on)
  • Deep Learning on Point Clouds

    • Proposal-based Instance Segmentation
      • 2D: Mask RCNN
      • GSPN
      • 3D-SIS
      • 3D-BoNet
      • 3D-MPA
      • GICN
    • Clustering-based Instance Segmentation
      • SGPN
      • JSIS3D
      • MTML
      • OccuSeg
      • PointGroup

Method

The whole pipeline can fairly be described as a heavily modified version of PointGroup.

Backbone: 3D sparse CNN

Hierarchical aggregation → Point aggregation + Set aggregation

We first aggregate points to sets with low bandwidth to avoid over-segmentation and then set aggregation with dynamic bandwidth is adopted to form complete instances.

Set aggregation may absorb noisy point sets into predictions, making the aggregated instances over-complete.

Sub-network → outlier filtering and mask quality scoring

Outline

  1. point-wise prediction network → extracts features from point clouds and predicts point-wise semantic labels and center shift vectors.

  2. point aggregation module → preliminary instance predictions

  3. set aggregation module → expands incomplete instances to cover missing parts

  4. intra-instance prediction network → smooths instances to filter out outliers

1. Point-wise Prediction Network

Similar to PointGroup (backbone: 3D U-Net)

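  • Semantic label prediction (CE loss, 2 layers)

  • Center Shift Vector Prediction (Smooth L1 loss) $\delta x$

    • PointGroup uses L1 loss + cosine loss: L1 is non-differentiable at the local optimum and tends to oscillate there; the cosine loss has similar issues
    • Why not L2 loss: outliers produce large gradients (possible gradient explosion?), although its advantage is that gradients are small near the local optimum, which makes convergence easier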
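The trade-off in the two bullets above can be seen by comparing per-element gradients of the three losses with respect to the residual r (predicted shift minus target shift). This is only an illustrative sketch: the function names are hypothetical, and `beta=1.0` is an assumed transition point, not a value taken from the paper.

```python
import numpy as np

def l1_grad(r):
    # d|r|/dr: constant magnitude, jumps from -1 to +1 around r = 0
    return np.sign(r)

def l2_grad(r):
    # d(r^2)/dr: grows linearly with the residual, so outliers dominate
    return 2.0 * r

def smooth_l1_grad(r, beta=1.0):
    # Smooth L1 is 0.5 * r^2 / beta for |r| < beta, and |r| - 0.5 * beta otherwise.
    # Its gradient is therefore r / beta near zero and sign(r) for large residuals.
    return np.where(np.abs(r) < beta, r / beta, np.sign(r))

# Two outliers (|r| = 5) and two near-converged residuals (|r| = 0.1)
residuals = np.array([-5.0, -0.1, 0.1, 5.0])
print("L1       :", l1_grad(residuals))
print("L2       :", l2_grad(residuals))
print("Smooth L1:", smooth_l1_grad(residuals))
```

Smooth L1 behaves like L2 near zero (small gradients, no sign flipping) and like L1 for large residuals (bounded gradients), which matches the motivation given in the notes.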

2. Point Aggregation

  • ignore background labels (floor, wall)

  • Algorithm: similar to PointGroup

  • shrink the clustering radius (avoid over-segmentation → stone-like fragments)
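The point aggregation step can be sketched as a breadth-first search over center-shifted coordinates with a fixed radius. This is a brute-force illustration, not the authors' implementation: `point_aggregation`, its parameters, the background label ids `(0, 1)`, and the radius value are all assumptions made for the example.

```python
from collections import deque
import numpy as np

def point_aggregation(coords, shifts, labels, radius, background=(0, 1)):
    """Group foreground points whose shifted positions lie within `radius`
    of each other and share the same semantic label (O(n^2) sketch)."""
    shifted = coords + shifts              # move each point toward its predicted center
    n = len(shifted)
    visited = np.zeros(n, dtype=bool)
    is_bg = np.isin(labels, background)    # assumed ids for floor / wall
    clusters = []
    for seed in range(n):
        if visited[seed] or is_bg[seed]:
            continue
        visited[seed] = True
        queue, members = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            dist = np.linalg.norm(shifted - shifted[i], axis=1)
            near = np.flatnonzero(
                (dist < radius) & (labels == labels[seed]) & ~visited
            )
            visited[near] = True
            queue.extend(near.tolist())
            members.extend(near.tolist())
        clusters.append(np.array(members))
    return clusters

# Toy scene: two chairs (label 2) and one floor point (label 0)
demo_coords = np.array([[0.0, 0, 0], [0.1, 0, 0], [1.0, 0, 0], [1.1, 0, 0], [0.5, 0, -1]])
demo_shifts = np.zeros_like(demo_coords)   # pretend the shift prediction is zero
demo_labels = np.array([2, 2, 2, 2, 0])
demo_clusters = point_aggregation(demo_coords, demo_shifts, demo_labels, radius=0.3)
print([c.tolist() for c in demo_clusters])
```

With a small radius the two chairs stay separate; the price is that a single large object may break into several sets, which is what the next stage is meant to repair.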

3. Set Aggregation

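This is a new step proposed in the paper. Its purpose is to merge each primary instance with the fragments that belong to it.

Rules for merging a fragment instance into a primary instance:

  1. Both belong to the same class

  2. The distance between their geometric centers must be below a threshold (this value is obtained from statistics)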
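The two merging rules can be sketched as follows. Everything beyond the two rules is an assumption made for illustration: the function name `set_aggregation`, the precomputed `primary_mask` (the notes do not say how primary instances and fragments are told apart), the single scalar threshold `max_center_dist`, and the choice to attach a fragment to its nearest qualifying primary instance.

```python
import numpy as np

def set_aggregation(sets, set_labels, primary_mask, max_center_dist):
    """Attach each fragment to the nearest primary set of the same class
    whose geometric center is closer than `max_center_dist`."""
    centers = np.stack([s.mean(axis=0) for s in sets])   # geometric center per set
    primary_idx = np.flatnonzero(primary_mask)
    merged = {int(p): [int(p)] for p in primary_idx}
    for f in np.flatnonzero(~primary_mask):
        # Rule 1: candidate primaries must share the fragment's class
        same_class = primary_idx[set_labels[primary_idx] == set_labels[f]]
        if same_class.size == 0:
            continue
        # Rule 2: center distance must be below the threshold
        dist = np.linalg.norm(centers[same_class] - centers[f], axis=1)
        nearest = int(np.argmin(dist))
        if dist[nearest] < max_center_dist:
            merged[int(same_class[nearest])].append(int(f))
    return merged

demo_sets = [
    np.array([[0.0, 0, 0], [0.2, 0, 0]]),   # primary set, center (0.1, 0, 0)
    np.array([[0.4, 0, 0]]),                # nearby fragment, same class
    np.array([[5.0, 0, 0]]),                # same class but far away
    np.array([[0.1, 0.1, 0]]),              # nearby but different class
]
demo_set_labels = np.array([2, 2, 2, 3])
demo_primary = np.array([True, False, False, False])
demo_merged = set_aggregation(demo_sets, demo_set_labels, demo_primary, max_center_dist=0.5)
print(demo_merged)
```

Only the nearby same-class fragment is absorbed; the distant fragment and the fragment of another class are left alone.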

4. Intra-instance Prediction Network

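This stage improves on the existing score-prediction step.

  • The new network first predicts a mask to distinguish the instance foreground and background

  • It then predicts the IoU between the mask and the best-matched ground truth (CE loss)

Open issues

  • The input is a single instance (not well suited to the ShapeNet dataset)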
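For reference, the IoU with the best-matched ground truth can be computed as in the sketch below. This only shows how the target value is obtained, not how the network or its loss is built; `best_matched_iou` and the `ignore_id=-1` convention for unlabeled points are assumptions for the example.

```python
import numpy as np

def best_matched_iou(pred_mask, gt_instance_ids, ignore_id=-1):
    """Return (IoU, gt id) of the ground-truth instance that overlaps
    the predicted boolean mask the most."""
    best_iou, best_id = 0.0, None
    for gid in np.unique(gt_instance_ids):
        if gid == ignore_id:
            continue                         # skip unlabeled points
        gt_mask = gt_instance_ids == gid
        inter = np.logical_and(pred_mask, gt_mask).sum()
        union = np.logical_or(pred_mask, gt_mask).sum()
        iou = inter / union if union > 0 else 0.0
        if iou > best_iou:
            best_iou, best_id = float(iou), int(gid)
    return best_iou, best_id

# Predicted instance covers points 0-2; GT instance 0 owns points 0-1, instance 1 owns 2-4
demo_pred = np.array([True, True, True, False, False, False])
demo_gt = np.array([0, 0, 1, 1, 1, -1])
demo_iou, demo_gid = best_matched_iou(demo_pred, demo_gt)
print(demo_iou, demo_gid)
```

Here the prediction overlaps instance 0 on 2 of 3 points in the union and instance 1 on 1 of 5, so instance 0 is the best match.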

5. Multi-task Training


Other characteristics:

  1. NMS Free

Experiments

Result

Ablation Study

Comparison of inference time

Qualitative results on ScanNet v2