Text Understanding from Images/Videos

We work on a general text understanding pipeline, including:

  • Text detection

  • Text recognition

  • Layout analysis

Text Detection

  • X. Yue, Z. Kuang, Z. Zhang, Z. Chen, P. He, Y. Qiao and W. Zhang.
    Boosting up scene text detectors with guided CNN.
    British Machine Vision Conference (BMVC), 2018. (Oral presentation, acceptance rate: 6.5%) [paper]

    Most existing text detection methods try to improve accuracy through sophisticated network design while paying less attention to speed. We propose a general framework for text detection, called Guided CNN, that pursues both goals simultaneously. The model consists of a guidance subnetwork, which learns a guidance mask from the input image itself, and a primary text detector, in which every convolution and non-linear operation is computed only within the guidance mask. The guidance subnetwork coarsely filters out non-text regions, greatly reducing the computational cost. The primary text detector can then focus on distinguishing text from hard non-text regions and on regressing text bounding boxes, which yields better detection accuracy. We demonstrate that Guided CNN is both effective and efficient using two state-of-the-art methods, CTPN and EAST, as backbones.
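The idea of computing a convolution only inside a guidance mask can be illustrated with a small NumPy sketch. This is not the implementation from the paper: the function names `toy_guidance_mask` and `masked_conv2d` are hypothetical, the mask here comes from a simple intensity threshold rather than a learned subnetwork, and the kernel is assumed to have odd height and width. The sketch only shows how skipping positions outside the mask reduces the number of evaluated outputs.

```python
import numpy as np


def toy_guidance_mask(image, threshold):
    # Stand-in for the learned guidance subnetwork (assumption):
    # mark pixels brighter than `threshold` as candidate text regions.
    return image > threshold


def masked_conv2d(image, kernel, mask):
    # "Same"-size 2-D cross-correlation evaluated only where `mask` is True.
    # Positions outside the mask are left at zero and cost no arithmetic.
    kh, kw = kernel.shape  # assumed odd, e.g. 3x3
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    ys, xs = np.nonzero(mask)
    for y, x in zip(ys, xs):
        patch = padded[y:y + kh, x:x + kw]
        out[y, x] = float(np.sum(patch * kernel))
    return out, len(ys)  # result and number of evaluated positions


# Toy input: an 8x8 image with one bright 3x3 "text" block.
image = np.zeros((8, 8))
image[2:5, 3:6] = 1.0
kernel = np.ones((3, 3)) / 9.0  # simple averaging filter

mask = toy_guidance_mask(image, threshold=0.5)
out, n_eval = masked_conv2d(image, kernel, mask)
print(f"evaluated {n_eval} of {image.size} positions")
```

In this toy case only the positions inside the bright block are evaluated, so the saving grows with the fraction of the image that the mask rules out. A practical detector would apply the same principle to every layer of the primary network rather than to a single filter.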

Copyright © 2019 Wei Zhang