Building on tools developed both inside and outside the lab, including a fast bi-level image convolution algorithm, cellular image processing tools, and an image vectorizer (a raster-to-vector converter), we are building tools that transform a printed or typed table of data back into a usable ASCII form. Traditional OCR methods perform poorly on tables because the horizontal and vertical lines separating the cells often overlap part of the cell data. The most recent application of these tools is the automated newspaper decomposition used at http://www.nyt.ulib.org.
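To illustrate the ruling-line problem described above, here is a minimal Python sketch that clears long horizontal and vertical runs of black pixels from a bi-level image before the cell contents are passed to OCR. It is not the lab's algorithm; the function names `find_long_runs` and `remove_ruling_lines`, the `min_len` threshold, and the list-of-lists image representation (1 = black, 0 = white) are all assumptions made for illustration.

```python
def find_long_runs(pixels, min_len):
    """Return (start, end) index pairs for runs of black pixels of length >= min_len."""
    runs = []
    start = None
    for i, px in enumerate(pixels):
        if px and start is None:
            start = i                      # a black run begins here
        elif not px and start is not None:
            if i - start >= min_len:
                runs.append((start, i))    # end index is exclusive
            start = None
    # Handle a run that reaches the end of the row or column.
    if start is not None and len(pixels) - start >= min_len:
        runs.append((start, len(pixels)))
    return runs


def remove_ruling_lines(image, min_len):
    """Return a copy of a bi-level image with long straight runs cleared.

    Assumption: any horizontal or vertical run of at least min_len black
    pixels is treated as a table ruling line rather than part of a character.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    mask = [[False] * width for _ in range(height)]

    # Mark long horizontal runs.
    for y, row in enumerate(image):
        for start, end in find_long_runs(row, min_len):
            for x in range(start, end):
                mask[y][x] = True

    # Mark long vertical runs.
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        for start, end in find_long_runs(column, min_len):
            for y in range(start, end):
                mask[y][x] = True

    # Clear every marked pixel; leave everything else untouched.
    return [
        [0 if mask[y][x] else image[y][x] for x in range(width)]
        for y in range(height)
    ]


# A 7x7 cell: a boxed border with a small three-pixel "glyph" inside.
image = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
cleaned = remove_ruling_lines(image, min_len=5)
```

In this toy case the border is removed and only the three glyph pixels remain. Note that a naive filter like this also erases any character strokes that coincide with a ruling line, which is exactly the overlap that makes tables hard for conventional OCR; a practical tool would need to restore or protect those strokes.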
past head
- Robert H. Thibadeau
past contact
- Robert H. Thibadeau