Document Recognition: OCR


Overview

Although OCR (Optical Character Recognition) is widely regarded as a solved problem, this holds only for clean documents scanned at very high resolution. OCR performance degrades significantly when even small amounts of noise are present in the document image. We aim to overcome this limitation by incorporating modern statistical language modelling techniques into OCR, producing a more robust system that is resistant to high levels of noise in the document. In addition, we aim to do this without a large number of stored font models, relying instead on statistical properties of the English language.

Faculty


Graduate Students


References

  • Gary B. Huang, Andrew Kae, Carl Doersch, Erik Learned-Miller
    Bounding the Probability of Error for High Precision Optical Character Recognition
    Journal of Machine Learning Research (JMLR), 2012.
    [pdf] [project]
  • Andrew Kae, Kin Kan, Vijay K Narayanan, Dragomir Yankov
    Categorization of Display Ads using Image and Landing Page Features
    The Third Workshop on Large-scale Data Mining: Theory and Applications (LDMTA'11), in conjunction with SIGKDD 2011.
    [pdf]
  • Andrew Kae, David A. Smith, and Erik Learned-Miller
    Learning on the Fly: A font-free approach towards multilingual OCR
    International Journal on Document Analysis and Recognition (IJDAR)
    [pdf] [Springer]
  • Andrew Kae, Gary Huang, Carl Doersch, and Erik Learned-Miller
    Improving State-of-the-Art OCR through High-Precision Document-Specific Modeling
    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
    [pdf]
  • Andrew Kae, Gary Huang, and Erik Learned-Miller
    Bounding the Probability of Error for High Precision Recognition
    Technical Report UM-CS-2009-031, Dept. of Computer Science, University of Massachusetts, Amherst, 2009.
    [pdf] [arxiv.org] [project]
  • Andrew Kae and Erik Learned-Miller
    Learning on the Fly: Font-Free Approaches to Difficult OCR Problems
    Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2009.
    [pdf]
  • Michael Wick, Michael G. Ross, and Erik Learned-Miller
    Context-Sensitive Error Correction: Using Topic Models to Improve OCR
    Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2007.
    [pdf]
  • Gary C. Huang, Erik Learned-Miller, and Andrew McCallum
    Cryptogram Decoding for OCR using Numerization Strings
    Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2007.
    [pdf]