Researchers from Facebook AI have presented a novel image segmentation method that can produce high-quality, precise segmentation masks.
The method is based on a new neural network module called PointRend, which makes point-based segmentation predictions that yield more precise results, especially around object edges.
PointRend employs an iterative subdivision algorithm inspired by image rendering in computer graphics. In their paper, the researchers argue that the analogy between image segmentation and rendering over occupancy grids can be leveraged to build a neural network model that performs better and outputs higher-resolution segmentation masks.
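To make the subdivision idea concrete, here is a minimal sketch of a single refinement step on a binary mask, written in plain Python. It rests on several assumptions: the function names are hypothetical, nearest-neighbour upsampling stands in for the smoother interpolation a real model would use, "uncertain" is taken to mean a foreground probability close to 0.5, and `predict_point` is a placeholder callback for whatever produces the refined per-point prediction. It is not the authors' implementation.

```python
def upsample2x(prob):
    """Nearest-neighbour 2x upsampling of a 2D grid of probabilities."""
    out = []
    for row in prob:
        wide = [p for p in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out


def most_uncertain_points(prob, n):
    """Return (row, col) of the n cells whose probability is closest to 0.5."""
    cells = [
        (abs(p - 0.5), r, c)
        for r, row in enumerate(prob)
        for c, p in enumerate(row)
    ]
    cells.sort()  # smallest distance to 0.5 first; ties broken by position
    return [(r, c) for _, r, c in cells[:n]]


def subdivision_step(prob, n, predict_point):
    """Upsample the mask, then re-predict only the n most uncertain cells.

    predict_point(r, c, height, width) is a hypothetical callback that
    returns a refined foreground probability for one cell.
    """
    fine = upsample2x(prob)
    height, width = len(fine), len(fine[0])
    for r, c in most_uncertain_points(fine, n):
        fine[r][c] = predict_point(r, c, height, width)
    return fine
```

Repeating `subdivision_step` until the mask reaches the target resolution concentrates the extra computation on ambiguous cells, typically those near object boundaries, rather than on every pixel of the output.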
The proposed module takes one or more CNN output feature maps, defined over a regular square grid that is several times coarser than the original image resolution. Using a point selection strategy, the algorithm first chooses a small number of points on the grid. For each of these points, a point-wise feature representation is extracted using bilinear interpolation of the coarse grid. Finally, a small neural network is trained to predict a label from the point-wise feature representation.
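The feature-extraction and prediction steps described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the released code: the function names are hypothetical, each feature map is a single-channel list of rows, coordinates are given in grid-cell units, and the weights passed to `point_head` are placeholders that would normally be learned during training.

```python
import math


def bilinear_sample(grid, x, y):
    """Bilinearly interpolate a 2D grid at a continuous location.

    x indexes columns and y indexes rows, both in grid-cell units.
    """
    h, w = len(grid), len(grid[0])
    # Clamp the query point to the valid extent of the grid.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
    bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def point_features(feature_maps, x, y):
    """Build a point-wise feature vector by sampling every feature map."""
    return [bilinear_sample(fm, x, y) for fm in feature_maps]


def point_head(features, w1, b1, w2, b2):
    """Hypothetical tiny MLP: one ReLU hidden layer, sigmoid output."""
    hidden = [
        max(0.0, sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))  # foreground probability
```

For a selected point, `point_features(feature_maps, x, y)` yields one value per feature map, and `point_head` maps that vector to a foreground probability. Because the same small network is applied independently at every selected point, the cost scales with the number of points rather than with the output resolution.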
The PointRend module can be successfully plugged into any CNN model for both instance and semantic segmentation. Researchers applied PointRend on top of existing state-of-the-art models and showed improved segmentation performance. The evaluations of the method showed that it produces much sharper edges in the output segmentation masks and that it can produce high-resolution outputs in an efficient way.
The implementation of PointRend has been open-sourced and is available on GitHub. The paper can be found here.