Object Detection | Notes on the Paper "Point Linking Network for Object Detection"

lamb

Point Linking Network for Object Detection

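The approach is reminiscent of the multi-person pose estimation problem.
Note: my local Typora setup and the team blog are not fully compatible; I will add a link to my own blog later.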

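Overview:

Unlike Faster R-CNN, YOLO and SSD, which treat the bounding box as a whole, PLN uses a fully convolutional network to regress the corner/center points of bounding boxes and the links between them, maps the points and links back to multiple bounding boxes, and finally fuses those boxes to obtain the detection result. PLN works well on overlapping objects and on objects with widely varying aspect ratios.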

Original text: we regress the corner/center points of bounding-box and their links using a fully convolutional network; then we map the corner points and their links back to multiple bounding boxes; finally an object detection result is obtained by fusing the multiple bounding boxes.

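PLN predicts four kinds of center-corner point pairs, $OC_1$, $OC_2$, $OC_3$ and $OC_4$, and then recovers the point coordinates from the image grid. Clearly, a single center-corner point pair is enough to determine a bounding box.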
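
To make the last claim concrete, here is a minimal sketch of recovering an axis-aligned box from one center-corner pair. The function name `box_from_center_and_corner` and the `(x_min, y_min, x_max, y_max)` output format are assumptions made for illustration, not something taken from the paper.

```python
def box_from_center_and_corner(center, corner):
    """Return (x_min, y_min, x_max, y_max) for the axis-aligned box
    that has the given center and any one of its four corners."""
    cx, cy = center
    px, py = corner
    # The opposite corner is the reflection of the known corner through the center.
    ox, oy = 2 * cx - px, 2 * cy - py
    return (min(px, ox), min(py, oy), max(px, ox), max(py, oy))


# Center at (50, 40) paired with a corner at (80, 20).
print(box_from_center_and_corner((50.0, 40.0), (80.0, 20.0)))
# (20.0, 20.0, 80.0, 60.0)
```

Because any of the four corners works, each of the four pair types can propose a box for the same object independently.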

Shortcomings of other networks:

One-stage methods ignore global context.

Two-stage methods represent a varying number of bounding boxes with a fixed-length vector, and grids with no matched bounding box are ignored during training. Original text: The crux of these methods is representing a various number of bounding boxes using a fixed length vector (the grids with no matched bounding box will be ignored during training).

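PLN in brief

PLN casts object detection as a point detection problem and a point linking problem (point linking means connecting, among the detected points, those that belong to the same object). This makes it more flexible with respect to scale and aspect ratio; a single center-corner point pair is enough to produce a result, so a voting scheme can be added to improve accuracy; and it exploits local cues. In practice the whole network is a single deep network, and point detection and point linking share one loss function (A grid cell is responsible for predicting the center point inside includes its confidence, x-offset, y-offset and link, as well as the corner point inside also includes its confidence, x-offset, y-offset and link). It draws inspiration from DPM (Deformable Part Model) and detects objects based on their parts.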

PLN in detail

Network design

In the figure, the image is divided into S×S grid cells. B denotes the number of objects each cell can detect, so each cell predicts B center points and B corner points (the top-right corner in this branch). In the paper's words: The 1st to the $B$-th predictions are for center points and the $(B+1)$-th to the $(2B)$-th predictions are for corner points. Each prediction contains four items: $P_{ij}$, $Q_{ij}$, $\left[x_{ij}, y_{ij}\right]$, $\left[L_{ij}^{x}, L_{ij}^{y}\right]$, where $i \in\left[1, \cdots, S^{2}\right]$ is the spatial index and $j \in[1, \cdots, 2 \times B]$ is the point index.

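Here $P_{ij}$ is the probability that a point exists in the grid cell; $Q_{ij}$ is the probability distribution over object classes; $\left[x_{ij}, y_{ij}\right]$ is the fine position (the offset of the exact position from the grid cell's corner); and $\left[L_{ij}^{x}, L_{ij}^{y}\right]$ is the link between points. $L_{ij}^{x}$ and $L_{ij}^{y}$ are vectors of length $S$, and together they link the point $(i, j)$ to the grid cell at row $\arg\max_{k} L(k)_{ij}^{x}$ and column $\arg\max_{k} L(k)_{ij}^{y}$.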
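
The four items above fix how many numbers each grid cell must output, and the argmax rule fixes how a link is decoded. The sketch below works through both. The helper names and the values S=14, B=2, N=20 are illustrative assumptions; indices are 0-based here, whereas the formulas use 1-based indices.

```python
def channels_per_cell(S, B, N):
    """Numbers predicted by one grid cell.

    Each of the 2*B points carries: P (1 value), Q (N class scores),
    the offsets x and y (2 values), and the link vectors L^x and L^y
    (S values each).
    """
    return 2 * B * (1 + N + 2 + 2 * S)


def linked_cell(link_x, link_y):
    """Decode a link: row = argmax_k L(k)^x, column = argmax_k L(k)^y."""
    row = max(range(len(link_x)), key=link_x.__getitem__)
    col = max(range(len(link_y)), key=link_y.__getitem__)
    return row, col


# Assumed example sizes: 14x14 grid, 2 points per type, 20 classes.
print(channels_per_cell(S=14, B=2, N=20))   # 204

# A toy link on a 3x3 grid: the point is linked to the cell at row 1, column 0.
print(linked_cell([0.1, 0.7, 0.2], [0.6, 0.3, 0.1]))   # (1, 0)
```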


Loss Function

$\text{Loss}_{ij}=\mathbb{1}_{ij}^{\mathrm{pt}} \operatorname{Loss}_{ij}^{\mathrm{pt}}+\mathbb{1}_{ij}^{\mathrm{nopt}} \operatorname{Loss}_{ij}^{\mathrm{nopt}}$

The superscripts indicate whether the grid cell contains a point (pt) or not (nopt), where

$\begin{aligned} \operatorname{Loss}_{ij}^{\mathrm{pt}} = {} & \left(P_{ij}-1\right)^{2}+w_{\mathrm{class}} \sum_{n=1}^{N}\left(Q(n)_{ij}-\hat{Q}(n)_{ij}\right)^{2} \\ & +w_{\mathrm{coord}}\left(\left(x_{ij}-\hat{x}_{ij}\right)^{2}+\left(y_{ij}-\hat{y}_{ij}\right)^{2}\right) \\ & +w_{\mathrm{link}} \sum_{k=1}^{S}\left(\left(L(k)_{ij}^{x}-\hat{L}(k)_{ij}^{x}\right)^{2}+\left(L(k)_{ij}^{y}-\hat{L}(k)_{ij}^{y}\right)^{2}\right) \end{aligned}$

and $\operatorname{Loss}_{ij}^{\mathrm{nopt}}=P_{ij}^{2}$.

The terms of $\operatorname{Loss}_{ij}^{\mathrm{pt}}$ correspond, in order, to point existence, classification scores, x-offset, y-offset, and rough location of the linked point.

Summing over all grid cells and points gives $\mathrm{Loss}=\sum_{i=1}^{S^{2}} \sum_{j=1}^{2B} \operatorname{Loss}_{ij}$
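
As a rough illustration of the loss for a single point $(i, j)$, the sketch below evaluates both cases in plain Python. The dictionary layout, the function name `point_loss`, and the default weights of 1.0 are assumptions made for this example; these notes do not record the weight values used in the paper.

```python
def point_loss(pred, target, w_class=1.0, w_coord=1.0, w_link=1.0):
    """Loss_ij for one predicted point.

    pred / target are dicts with keys P, Q, x, y, Lx, Ly (hypothetical layout).
    target is None when the grid cell contains no point (the "nopt" case).
    The weights default to 1.0 only as placeholders.
    """
    if target is None:
        # No point in this cell: only penalize a non-zero existence score.
        return pred["P"] ** 2

    loss = (pred["P"] - 1.0) ** 2
    loss += w_class * sum((q - q_hat) ** 2 for q, q_hat in zip(pred["Q"], target["Q"]))
    loss += w_coord * ((pred["x"] - target["x"]) ** 2 + (pred["y"] - target["y"]) ** 2)
    loss += w_link * sum((l - l_hat) ** 2 for l, l_hat in zip(pred["Lx"], target["Lx"]))
    loss += w_link * sum((l - l_hat) ** 2 for l, l_hat in zip(pred["Ly"], target["Ly"]))
    return loss


# Toy values: 2 classes, a 2x2 grid.
pred = {"P": 0.5, "Q": [0.5, 0.5], "x": 0.25, "y": 0.5, "Lx": [1.0, 0.0], "Ly": [0.5, 0.5]}
target = {"Q": [1.0, 0.0], "x": 0.75, "y": 0.5, "Lx": [1.0, 0.0], "Ly": [0.0, 1.0]}

print(point_loss(pred, target))  # 0.25 + 0.5 + 0.25 + 0.0 + 0.5 = 1.5
print(point_loss(pred, None))    # 0.25
```

The total loss is then the sum of `point_loss` over every grid cell and every one of its 2B points.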

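Inference

Computing

$P_{ijnst}^{\mathrm{obj}}=P_{ij} P_{st} Q(n)_{ij} Q(n)_{st} \frac{L\left(s_{x}\right)_{ij}^{x} L\left(s_{y}\right)_{ij}^{y}+L\left(i_{x}\right)_{st}^{x} L\left(i_{y}\right)_{st}^{y}}{2}$

gives the probability that the points $(i, j)$ and $(s, t)$ form a point pair belonging to an object of class $n$. Here $i_x, i_y$ and $s_x, s_y$ are the x and y components of the spatial indices $i$ and $s$.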
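
The pairing score can be sketched for one candidate pair as follows. The argument names and the 0-based cell indices are assumptions for this example; the structure follows the formula: both existence scores, both class scores for class n, and the average of how strongly each point's link points at the other point's cell.

```python
def pair_probability(p_ij, p_st, q_ij_n, q_st_n,
                     lx_ij, ly_ij, lx_st, ly_st, i_xy, s_xy):
    """Score that point (i, j) and point (s, t) belong to one object of class n.

    i_xy = (i_x, i_y) and s_xy = (s_x, s_y) are the 0-based grid positions
    of the two points; lx_* and ly_* are the length-S link vectors.
    """
    i_x, i_y = i_xy
    s_x, s_y = s_xy
    # Mutual link agreement: (i, j) should point at s, and (s, t) at i.
    link_agreement = (lx_ij[s_x] * ly_ij[s_y] + lx_st[i_x] * ly_st[i_y]) / 2.0
    return p_ij * p_st * q_ij_n * q_st_n * link_agreement


score = pair_probability(
    p_ij=1.0, p_st=0.5, q_ij_n=1.0, q_st_n=0.5,
    lx_ij=[0.0, 1.0], ly_ij=[1.0, 0.0],   # (i, j) links confidently to cell (1, 0)
    lx_st=[0.5, 0.5], ly_st=[0.0, 1.0],   # (s, t) links less confidently to cell (0, 1)
    i_xy=(0, 1), s_xy=(1, 0),
)
print(score)  # 0.1875
```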

Branches Merging via NMS

The detections from the four branches are merged, and NMS is then applied to the merged set.
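
A minimal sketch of this step, assuming each branch yields a list of `(box, score)` tuples with boxes as `(x_min, y_min, x_max, y_max)`. It uses standard greedy NMS and ignores class labels for brevity; the function names and the 0.5 IoU threshold are illustrative assumptions, not the paper's settings.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_branches_nms(branches, iou_threshold=0.5):
    """Pool the detections of all branches, then run greedy NMS."""
    detections = [d for branch in branches for d in branch]
    detections.sort(key=lambda d: d[1], reverse=True)  # highest score first
    kept = []
    for box, score in detections:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


branches = [
    [((0, 0, 10, 10), 0.9)],     # one branch's detections
    [((1, 1, 10, 10), 0.8)],     # near-duplicate of the box above (IoU 0.81)
    [((20, 20, 30, 30), 0.7)],   # a separate object
    [],                          # a branch with no detections
]
print(merge_branches_nms(branches))
# [((0, 0, 10, 10), 0.9), ((20, 20, 30, 30), 0.7)]
```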

Experimental details

Omitted.

Comparison of results


Once my blog is set up, I will post a properly rendered LaTeX version of these notes.