Note: Two-Stage Underwater Object Detection Network Using Swin Transformer

王柏鈞
Published in 機器學習圖鑑
Nov 25, 2022

Jia Liu, Shuang Liu, Shujuan Xu (Member, IEEE), and Changjun Zhou. IEEE, Nov 4, 2022.

source: https://ieeexplore.ieee.org/document/9938441

topics

  • Underwater object detection
  • problems with colour offset, low contrast, and target blur in underwater image data
  • the paper proposes an underwater object detection algorithm based on Faster R-CNN

challenge

  • challenges in practice, such as poor quality, loss of visibility and weak contrast
  • image degradation destroys many features
  • For example, the colour information of sea urchins, scallops and other creatures is relatively stable, but the texture information is easily destroyed

algorithm architecture

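1. Swin Transformer is used as the backbone network

By my back-of-the-envelope estimate, the feature maps this backbone produces are surprisingly fat, so running it on edge devices will not be easy.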
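
To put a rough number on that estimate, here is a minimal sketch that counts the activations a Swin-style hierarchical backbone hands to the detection neck. The defaults (patch size 4, embedding dimension 96, four stages) follow the Swin-T configuration and are an assumption, since this note does not record which variant the paper uses; the 800x1280 input size is also just an illustrative choice.

```python
def swin_stage_shapes(height, width, embed_dim=96, patch_size=4, num_stages=4):
    """Return (channels, h, w) for each stage of a Swin-style backbone.

    Assumed defaults match Swin-T; the paper's actual variant may differ.
    """
    shapes = []
    h, w = height // patch_size, width // patch_size  # patch partition
    channels = embed_dim
    for _ in range(num_stages):
        shapes.append((channels, h, w))
        # Patch merging between stages: halve resolution, double channels.
        h, w, channels = h // 2, w // 2, channels * 2
    return shapes


def total_activations(shapes):
    """Number of scalar values across all stage outputs."""
    return sum(c * h * w for c, h, w in shapes)


shapes = swin_stage_shapes(800, 1280)  # hypothetical input resolution
count = total_activations(shapes)
megabytes_fp32 = count * 4 / 1e6  # 4 bytes per float32 value
```

This only counts the four stage outputs for a single image. Intermediate activations inside the transformer blocks and the model weights come on top of that.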

2. Adding the path aggregation network

FPN plus bottom-up path augmentation, roughly like grafting a U-Net onto the output of the FPN.
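
The bottom-up path can be sketched on toy single-channel maps. This is a simplified illustration, not the paper's implementation: as I understand the path aggregation design, each level is downsampled by a learned stride-2 convolution, added to the next FPN level, and then smoothed by another convolution. Here a 2x2 average stands in for the learned downsampling and the smoothing step is omitted. All function names are hypothetical.

```python
def downsample2x(fmap):
    """Halve a 2D map by averaging each 2x2 block (stand-in for a stride-2 conv)."""
    rows, cols = len(fmap) // 2, len(fmap[0]) // 2
    return [
        [
            (fmap[2 * r][2 * c] + fmap[2 * r][2 * c + 1]
             + fmap[2 * r + 1][2 * c] + fmap[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def add_maps(a, b):
    """Element-wise sum of two 2D maps of the same size."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def bottom_up_path(fpn_levels):
    """fpn_levels: FPN outputs ordered fine to coarse, each half the previous size."""
    outputs = [fpn_levels[0]]  # the finest level is passed through unchanged
    for p_next in fpn_levels[1:]:
        # Push fine-level information upward and fuse it with the coarser level.
        outputs.append(add_maps(downsample2x(outputs[-1]), p_next))
    return outputs


# Toy pyramid: 8x8, 4x4 and 2x2 maps filled with ones.
pyramid = [[[1.0] * s for _ in range(s)] for s in (8, 4, 2)]
fused = bottom_up_path(pyramid)
```

The point of the extra path is that low-level localisation cues from the fine maps reach the coarse maps in a few steps instead of travelling through the whole backbone.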

3. OHEM(Online Hard Example Mining)

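First run a forward pass through the RoI network, then select the hard, high-loss samples (for example false positives and missed detections) and run gradient descent on those only.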
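
The selection step can be sketched as follows, assuming the per-RoI losses from the forward pass are already available as a list. The function names and the loss values are made up for illustration, and details such as de-duplicating heavily overlapping RoIs before selection are left out.

```python
def select_hard_examples(roi_losses, num_hard):
    """Return the indices of the num_hard RoIs with the largest loss."""
    if num_hard < 1:
        raise ValueError("num_hard must be at least 1")
    ranked = sorted(range(len(roi_losses)), key=lambda i: roi_losses[i], reverse=True)
    return sorted(ranked[:num_hard])


def ohem_batch_loss(roi_losses, num_hard):
    """Average the loss over the hard examples only; easy RoIs contribute nothing."""
    hard = select_hard_examples(roi_losses, num_hard)
    return sum(roi_losses[i] for i in hard) / len(hard), hard


# Hypothetical per-RoI losses from one forward pass.
losses = [0.05, 2.30, 0.10, 1.75, 0.02, 0.90]
mean_loss, hard_idx = ohem_batch_loss(losses, num_hard=3)
```

In a real training loop only the selected RoIs would be passed through the backward step, which is where the savings and the focus on difficult cases come from.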

This could perhaps be replaced with focal loss or batch-hard triplet loss.
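
For comparison, a minimal sketch of the binary focal loss, which down-weights easy examples smoothly instead of selecting hard ones explicitly. The defaults gamma=2.0 and alpha=0.25 are the commonly quoted ones and are an assumption here, not something taken from the paper.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25):
    """Focal loss for one prediction: -alpha_t * (1 - p_t)^gamma * log(p_t).

    prob is the predicted probability of the positive class, target is 0 or 1.
    """
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct: tiny loss
hard = binary_focal_loss(0.10, 1)  # confident and wrong: large loss
```

With gamma set to 0 the modulating factor disappears and the function reduces to alpha-weighted cross-entropy, which makes it easy to check against a baseline.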

4. ROI pooling is improved to ROI align

This looks like ordinary RoI Align; I could not find any change. The interpolation method is still the same bilinear interpolation. REF: https://zhuanlan.zhihu.com/p/73138740
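
To make the difference from RoI pooling concrete, here is a small sketch of RoI Align on a single-channel map: the RoI boundaries are not rounded, and each output bin averages a few points sampled with bilinear interpolation. The coordinate convention (values sit at integer coordinates), the 2x2 output and the two samples per bin axis are simplifying assumptions; library implementations differ in these details.

```python
def bilinear_sample(fmap, y, x):
    """Sample a 2D feature map at a fractional (y, x) location."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(fmap, roi, out_size=2, samples=2):
    """roi = (y_start, x_start, y_end, x_end) as floats in feature-map coordinates."""
    y_start, x_start, y_end, x_end = roi
    bin_h = (y_end - y_start) / out_size
    bin_w = (x_end - x_start) / out_size
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            # Regularly spaced sample points inside the bin, no rounding anywhere.
            for sy in range(samples):
                for sx in range(samples):
                    y = y_start + (i + (sy + 0.5) / samples) * bin_h
                    x = x_start + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(fmap, y, x)
            row.append(total / (samples * samples))
        pooled.append(row)
    return pooled


# 4x4 map whose value is 4*y + x, so interpolated values are easy to check by hand.
fmap = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pooled = roi_align(fmap, (0.5, 0.5, 2.5, 2.5))
```

RoI pooling would first snap the box (0.5, 0.5, 2.5, 2.5) to integer cells, and that quantisation is what hurts small underwater targets the most.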

Result

  • The improved Faster R-CNN reaches 80.54% on the URPC2018 dataset.
  • It largely resolves missed and false detections of objects of different sizes in complex environments.

Conclusion

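  • Swapping the feature extractor for ResNet or EfficientNet, or integrating YOLO, are possible directions for improvement.
  • For the feature background, information that does not contain objects at a given confidence level can be filtered out using the standard deviation. Computing these statistics once per deployment site with the bootstrap method, as an initialization step, is worth trying.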
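
The background-filtering idea is only a suggestion in these notes, so the following is one possible reading of it rather than anything from the paper: estimate the mean and standard deviation of background responses at a site with bootstrap resampling, then keep only the responses that exceed the mean by k standard deviations. The function names, the threshold rule and the sample values are all hypothetical.

```python
import random
import statistics


def bootstrap_background_stats(background_scores, num_resamples=200, seed=0):
    """Bootstrap estimate of the mean and standard deviation of background responses."""
    rng = random.Random(seed)
    n = len(background_scores)
    means, stds = [], []
    for _ in range(num_resamples):
        # Resample with replacement, same size as the original sample.
        resample = [background_scores[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
        stds.append(statistics.pstdev(resample))
    return statistics.fmean(means), statistics.fmean(stds)


def filter_background(scores, mean, std, k=3.0):
    """Return indices of responses above mean + k * std, treated as non-background."""
    threshold = mean + k * std
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical responses collected once from object-free frames at one site.
site_background = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.08, 0.12]
mean, std = bootstrap_background_stats(site_background)
kept = filter_background([0.11, 0.95, 0.10, 0.60], mean, std)
```

The statistics would be computed once per site and reused, and k controls how confident the filter must be before it discards a response as background.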
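
Figure captions:

  • Swin Transformer for underwater images
  • URPC2018
  • Path aggregation network
  • Parameter count of the Swin Transformer (not yet counting the cost of the feature maps)
  • Performance improvement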
