Trident Networks: Novel State-of-the-art Model for Object Detection

Researchers from the University of Chinese Academy of Sciences have proposed a new state-of-the-art object detection method based on scale-aware deep neural networks.

In their novel paper, named “Scale-aware Trident Networks for Object Detection”, the group proposes a new type of neural network architecture for object detection that takes scale variation into account. According to them, this variation is one of the major challenges in object detection and has not been addressed properly in the past.

To demonstrate and also investigate the effect of scale variation on object detection algorithms performance, researchers conduct several controlled experiments modifying the receptive fields of convolutional neural networks. Using the findings from these experiments a novel network architecture was proposed called Trident Network.

Trident Network generates scale-aware feature maps efficiently by trident blocks with different receptive fields.

The proposed network has three branches that share the same parameters but differ in the dilatation rate i.e. the receptive fields. During training, the network provides different scale objects for each branch and the final proposals or detections from the branches are combined using Non-maximum Suppression (NMS).

The proposed architecture is in fact simply a backbone network where each of the branches can include different region proposal networks and object detection networks. Researchers include RPN and Fast R-CNN heads as shared detectors among branches for conducting the experiments.

Architecture of the backbone Trident Network.

Trident Network achieves state-of-the-art results on single model object detection with the COCO test dataset. The performance of the network was also measured against other scale handling methods and the proposed network significantly outperforms all other methods.

More details about the Trident Network architecture can be found in the official paper. The implementation of the method was open-sourced and it’s available here.