Bulk Document Scanning

Written by Helen Glenn Court

Bulk document scanning today is handled largely by specialty shops, and the reason is simple enough. Organizations looking into these solutions are usually migrating from a paper-based to a digital information system, which is a vast, challenging, and time-intensive undertaking. For the most part, investing in drum scanners or extremely high-end flatbeds isn't cost-effective for a one-time job. What's more, for a conversion to be efficient, expertise and technical quality control are critical in each of the three stages of the process: scanning, conversion to text, and indexing.

The Bulk Document Scanning Process

Ten or more years ago, drum scanners were the only choice for high-end image and bulk document scanning jobs. Their technology is particularly well suited to high-resolution images. Advances in flatbed scanner technology, however, have changed the picture, so to speak. The decided advantage that high-end flatbeds have over their drum counterparts is a design that can scan originals of all types and sizes, including three-dimensional documents such as books and magazines.

The critical step in bulk document scanning is converting the image that the scan yields back into text. The software that performs this magic, on the basis of character and pattern recognition, is known as optical character recognition, or OCR. It has advanced by leaps and bounds in the last decade. Quality control, however, is always critical to ensure accurate conversion: no OCR program, no matter how sophisticated, can convert with 100 percent accuracy.
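One common way to put a number on OCR quality control is to compare a sample of OCR output against a hand-verified transcription of the same page. The Python sketch below is only an illustration of that idea, not a description of any particular shop's workflow. The function names and the 99 percent review threshold are assumptions chosen for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep a match)
            ))
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Fraction of the hand-verified reference that the OCR got right,
    measured as 1 - (edit distance / reference length)."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


def needs_review(reference: str, ocr_output: str,
                 threshold: float = 0.99) -> bool:
    """Flag a sample for manual correction when accuracy falls below
    the threshold. The 0.99 default is an assumed, illustrative value."""
    return character_accuracy(reference, ocr_output) < threshold
```

A typical OCR slip, such as reading the letter "l" as the digit "1" in a short phrase, is enough to push a sample below the assumed threshold and flag it for a human check.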

Once the data is captured to image and converted back to text, it is electronically identified. This is the key to the "ready access" advantage that electronic systems have. The process is called metadata indexing, and the criteria it records include file type, document name, subject matter, author, creation date, status, keywords, and the like. Depending on the size and type of the organization (law firms and publishing houses are good examples), indexing may call for as many as 100 fields. Most businesses, however, will use far fewer.
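To make the idea of metadata indexing concrete, here is a minimal Python sketch of a document record and a lookup index built on the fields listed above. It is an assumption-laden illustration: the class names, the choice of which fields are searchable, and the in-memory storage are all hypothetical, and a production document management system would be considerably more elaborate.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DocumentRecord:
    """One scanned document plus its index fields (hypothetical layout)."""
    file_type: str
    document_name: str
    subject: str
    author: str
    creation_date: date
    status: str
    keywords: list[str] = field(default_factory=list)


class MetadataIndex:
    """In-memory index mapping field values to matching documents."""

    # Fields searched by exact (case-insensitive) value; an assumed subset.
    SEARCHABLE = ("file_type", "subject", "author", "status")

    def __init__(self) -> None:
        self._records: list[DocumentRecord] = []
        # field name -> lowercased value -> positions in self._records
        self._by_field = defaultdict(lambda: defaultdict(set))

    def add(self, record: DocumentRecord) -> int:
        """Store a record and index each searchable field and keyword."""
        pos = len(self._records)
        self._records.append(record)
        for name in self.SEARCHABLE:
            self._by_field[name][getattr(record, name).lower()].add(pos)
        for kw in record.keywords:
            self._by_field["keyword"][kw.lower()].add(pos)
        return pos

    def find(self, **criteria: str) -> list[DocumentRecord]:
        """Return records matching every criterion, in insertion order.
        With no criteria, all records are returned."""
        matches = None
        for name, value in criteria.items():
            hits = self._by_field[name].get(str(value).lower(), set())
            matches = hits if matches is None else matches & hits
        if matches is None:
            matches = set(range(len(self._records)))
        return [self._records[i] for i in sorted(matches)]
```

A call such as `index.find(author="J. Doe", status="final")` would return only the documents that satisfy both criteria, which is the "ready access" that indexing is meant to provide.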

