Researchers have released an updated version of the popular YOLO object detection neural network. It achieves state-of-the-art results on the MS-COCO dataset (43.5% AP) while running in real time at roughly 65 FPS on a Tesla V100 GPU.
The new model, called YOLO-v4, significantly outperforms existing methods in both detection performance and speed. In the paper “YOLOv4: Optimal Speed and Accuracy of Object Detection”, the researchers describe their search for a so-called “fast operating” object detector that can be easily trained and deployed in production systems. Their main goal was to optimize detector neural networks for parallel computation. To that end, they carefully studied how detector features proposed in earlier work affect performance, and they propose several architectures and architectural choices based on the results.
After exploring a large number of improvements across several state-of-the-art models and components, the researchers settled on a final YOLO-v4 model that consists of CSPDarknet53 as the backbone, SPP and PAN modules as the neck, and the YOLO-v3 head. Within those components, the new model uses many already proven features, such as CutMix and Mosaic data augmentation, DropBlock regularization, Mish activation, self-adversarial training, and others.
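Of the features listed above, the Mish activation is the easiest to show in a few lines. Its standard definition is mish(x) = x · tanh(softplus(x)), where softplus(x) = ln(1 + eˣ). The following is a minimal scalar sketch in plain Python for illustration only. The function names are our own, and it is not the implementation used in the YOLO-v4 code base, which applies the activation to whole tensors.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: ln(1 + exp(x))."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, Mish is smooth and lets small negative values through, approaching the identity function for large positive inputs.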
The models were evaluated on the ImageNet classification dataset and the MS-COCO object detection dataset, and the influence of each feature was studied across the different models. The results show that YOLO-v4 lies on the Pareto optimality curve: among the detectors compared, none is both faster and more accurate (measured in AP), and it outperforms the fastest and the most accurate of them on both criteria.
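To make the Pareto optimality curve concrete: a detector lies on the curve if no other detector is at least as fast and at least as accurate while being strictly better on one of the two. The sketch below computes that set for a handful of detectors. The names and the FPS/AP numbers are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
from typing import List, NamedTuple


class Detector(NamedTuple):
    name: str
    fps: float  # speed, higher is better
    ap: float   # accuracy, higher is better


def dominates(a: Detector, b: Detector) -> bool:
    """True if `a` is no worse than `b` on both axes and strictly better on one."""
    no_worse = a.fps >= b.fps and a.ap >= b.ap
    strictly_better = a.fps > b.fps or a.ap > b.ap
    return no_worse and strictly_better


def pareto_front(detectors: List[Detector]) -> List[Detector]:
    """Return the detectors that no other detector dominates."""
    return [
        d for d in detectors
        if not any(dominates(other, d) for other in detectors)
    ]


# Hypothetical data, for illustration only.
detectors = [
    Detector("A", fps=30.0, ap=45.0),
    Detector("B", fps=60.0, ap=43.0),
    Detector("C", fps=50.0, ap=40.0),  # slower and less accurate than B
    Detector("D", fps=90.0, ap=38.0),
]

if __name__ == "__main__":
    print([d.name for d in pareto_front(detectors)])  # ['A', 'B', 'D']
```

In this toy data, detector C is off the curve because B beats it on both speed and accuracy, while A, B, and D each represent a different speed/accuracy trade-off that nothing else improves on.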