Title: Segmentation of sonar imagery using convolutional neural networks and Markov random field
Authors: Liu, Peng; Song, Yan
Author affiliations: [Liu, Peng] Tianjin Univ, Sch Microelect, Tianjin 300072, Peoples R China.; [Song, Yan] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Pe
Corresponding author: Song, Yan; Song, Y
Corresponding author address: [Song, Y] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Peoples R China.
Source: MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING
Keywords: Side scan sonar; Image segmentation; Convolutional neural networks; Ensemble learning; Class imbalance problem; Markov random field
Abstract: In this paper, we present a novel method that incorporates convolutional neural networks (CNN) into a Markov random field (MRF) to automatically segment side scan sonar (SSS) images into object-highlight, object-shadow and sea-bottom reverberation areas. As a widely used ocean survey sensor, SSS provides high-resolution maps of the seafloor. Automatically segmenting SSS images in real time can assist the navigation and path planning of autonomous underwater vehicles. However, because of the speckle noise and intensity inhomogeneity in SSS images, it is difficult to find a robust SSS segmentation method. These facts motivate us to explore efficient CNN architectures to solve these problems. For pixel-level SSS segmentation, a CNN with multi-scale inputs (MSCNN) is employed so that the context information and the details around a central pixel are used simultaneously. In addition, to mitigate the impact of the class imbalance problem, two MSCNN training strategies are introduced, based on data augmentation and ensemble learning. Furthermore, to take into account the local dependencies of class labels, the results of MSCNN are used to initialize the MRF, which yields the final segmentation maps. Experimental results on real SSS images show that the proposed segmentation method outperforms MRF, CNN and semantic segmentation methods such as the fully convolutional network and SegNet in segmentation accuracy and generalization performance. Moreover, the efficiency of the proposed method is demonstrated on a retinal image dataset.
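
The step in which classifier outputs initialize an MRF can be illustrated with a minimal sketch. The abstract does not state the MRF energy or the optimizer, so everything below is an assumption made for illustration: a Potts smoothness prior over 4-connected neighbours, iterated conditional modes (ICM) as the optimizer, and the hypothetical names `icm_refine`, `beta` and `n_iters`. The `probs` array stands in for per-pixel class probabilities from a network; this is not the authors' implementation.

```python
import numpy as np

def icm_refine(probs, beta=1.0, n_iters=5):
    """Refine per-pixel class probabilities with a Potts-prior MRF using ICM.

    probs:   array of shape (H, W, K) with per-pixel class probabilities
             (a stand-in for the classifier output).
    beta:    assumed smoothness weight; larger values favour agreement
             with the 4-connected neighbours.
    n_iters: maximum number of raster-scan sweeps.
    """
    h, w, k = probs.shape
    unary = -np.log(np.clip(probs, 1e-12, 1.0))  # data cost per class
    labels = probs.argmax(axis=2)                # initialise from the classifier
    for _ in range(n_iters):
        changed = False
        for i in range(h):
            for j in range(w):
                cost = unary[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        # Potts penalty: beta for each class that differs
                        # from the neighbour's current label.
                        penalty = np.full(k, beta)
                        penalty[labels[ni, nj]] = 0.0
                        cost += penalty
                best = int(cost.argmin())
                if best != labels[i, j]:
                    labels[i, j] = best
                    changed = True
        if not changed:  # stop once a full sweep changes nothing
            break
    return labels

# Toy example: a uniform background with one isolated, weakly confident pixel.
probs = np.full((5, 5, 3), 0.05)
probs[..., 0] = 0.9
probs[2, 2] = [0.2, 0.6, 0.2]
refined = icm_refine(probs, beta=1.0)
```

In this toy input the isolated centre pixel is relabelled to match its neighbours when `beta=1.0`, while `beta=0.0` reduces to the plain per-pixel argmax. ICM only finds a local minimum of the energy, which is one reason a good initialization matters.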