Document Recognition: OCR

Overview

Although OCR (Optical Character Recognition) is widely regarded as a solved problem, this holds only when documents are clean and scanned at very high resolution. OCR performance degrades significantly when even small amounts of noise are present in the document image. We aim to overcome this limitation by incorporating modern statistical language modelling techniques into OCR, producing a more robust system that tolerates high levels of noise in the document. In addition, we aim to do this without a large number of stored font models, relying instead on statistical properties of the English language.
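As a rough illustration of how a language model can compensate for noisy character evidence, the sketch below combines per-glyph recognition scores with a character bigram model using Viterbi decoding. It is a generic noisy-channel example, not the method of the papers listed under References; the function name `viterbi_decode`, the `lm_weight` parameter, the four-letter alphabet, and all scores are hypothetical toy values chosen for illustration.

```python
import math

def viterbi_decode(emissions, bigram_logp, alphabet, lm_weight=1.0):
    """Return the highest-scoring character sequence and its log score.

    emissions: one dict per glyph, mapping char -> log-likelihood of the
               glyph image given that char (hypothetical recognizer output).
    bigram_logp: function (prev, cur) -> log score of cur following prev;
                 prev is None at the start of the word.
    """
    # best[c] holds (score, path) for the best sequence ending in c.
    best = {
        c: (emissions[0][c] + lm_weight * bigram_logp(None, c), [c])
        for c in alphabet
    }
    for em in emissions[1:]:
        new_best = {}
        for c in alphabet:
            # Pick the predecessor that maximizes score plus transition.
            score, path = max(
                ((best[p][0] + lm_weight * bigram_logp(p, c), best[p][1])
                 for p in alphabet),
                key=lambda t: t[0],
            )
            new_best[c] = (score + em[c], path + [c])
        best = new_best
    score, path = max(best.values(), key=lambda t: t[0])
    return "".join(path), score

def logs(probs):
    return {k: math.log(v) for k, v in probs.items()}

ALPHABET = ["t", "h", "e", "c"]

# Toy glyph scores: noise makes the third glyph look slightly more like "c".
emissions = [
    logs({"t": 0.85, "h": 0.05, "e": 0.05, "c": 0.05}),
    logs({"t": 0.05, "h": 0.85, "e": 0.05, "c": 0.05}),
    logs({"t": 0.05, "h": 0.05, "e": 0.40, "c": 0.50}),
]

# Toy (unnormalized) bigram scores; unseen pairs get a small default.
BIGRAMS = {(None, "t"): 0.4, ("t", "h"): 0.5, ("h", "e"): 0.6, ("h", "c"): 0.01}

def bigram_logp(prev, cur):
    return math.log(BIGRAMS.get((prev, cur), 0.05))

with_lm, _ = viterbi_decode(emissions, bigram_logp, ALPHABET)
without_lm, _ = viterbi_decode(emissions, bigram_logp, ALPHABET, lm_weight=0.0)
print(with_lm, without_lm)
```

With the language model switched off (`lm_weight=0.0`), the decoder follows the noisy glyph scores alone; with it on, the bigram evidence can override a weakly preferred but implausible character.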

Faculty

Graduate Students

References

Gary B. Huang, Andrew Kae, Carl Doersch, and Erik Learned-Miller
Bounding the Probability of Error for High Precision Optical Character Recognition
Journal of Machine Learning Research (JMLR), 2012.

Andrew Kae, Kin Kan, Vijay K Narayanan, and Dragomir Yankov
Categorization of Display Ads using Image and Landing Page Features
The Third Workshop on Large-scale Data Mining: Theory and Applications (LDMTA'11), in conjunction with SIGKDD 2011.

Andrew Kae, David A. Smith, and Erik Learned-Miller
Learning on the Fly: A font-free approach towards multilingual OCR
International Journal on Document Analysis and Recognition (IJDAR).

Andrew Kae, Gary Huang, Carl Doersch, and Erik Learned-Miller
Improving State-of-the-Art OCR through High-Precision Document-Specific Modeling
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.

Andrew Kae, Gary Huang, and Erik Learned-Miller
Bounding the Probability of Error for High Precision Recognition
Technical Report UM-CS-2009-031, Dept. of Computer Science, University of Massachusetts, Amherst, 2009.

Andrew Kae and Erik Learned-Miller
Learning on the fly: Font free approaches to difficult OCR problems
Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2009.

Michael Wick, Michael G. Ross, and Erik Learned-Miller
Context-Sensitive Error Correction: Using Topic Models to Improve OCR
Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2007.

Gary C. Huang, Erik Learned-Miller, and Andrew McCallum
Cryptogram Decoding for OCR using Numerization Strings
Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2007.
