Abstract: Text in natural-scene images is often characterized by complex backgrounds, varied shapes, multiple orientations, and changing illumination. To improve detection performance for scene text, particularly irregular text, we propose the feature guided adaptive network (FGANet), an irregular scene text detection network based on feature filtering and adaptive fusion mechanisms. Specifically, FGANet includes a module that uses dilated convolution to enlarge the receptive field and enhance the network's feature representation capability. Its adaptive feature fusion module integrates deep semantic information with shallow detail information, strengthening the network's text awareness. Experimental results show that, for scene text detection, FGANet achieves notable F-score improvements over the compared methods on four benchmark datasets, ICDAR2015, CTW1500, MSRA-TD500, and TotalText, with gains of 2.4%, 1.3%, 1.8%, and 1.4%, respectively.
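
As a brief illustration of why dilated convolution enlarges the receptive field, the sketch below computes the one-dimensional receptive field of a stack of stride-1 convolutions. The kernel size and dilation rates are hypothetical values chosen for illustration; they are not taken from FGANet, and the function name `receptive_field` is likewise an assumption of this sketch.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis for stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field
    by (k - 1) * d input positions.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical comparison: three 3x3 layers, without and with dilation.
plain = receptive_field(3, [1, 1, 1])    # standard convolutions
dilated = receptive_field(3, [1, 2, 4])  # assumed dilation rates 1, 2, 4
print(plain, dilated)
```

Under these assumed settings, the dilated stack covers a wider input region with the same number of layers and weights, which is the property the abstract refers to.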