Digitize your documents

We have tips on scanners, OCR software, Web OCR, and converting your books to e-books.

The space required to store paper documents can be a problem. Digitizing your documents renders them exquisitely portable--you can store an entire library on your e-book reader with ease. And because paper documents can be turned into editable computer documents, they become searchable. Compare typing "Roosevelt" in a search field with spending all day scanning microfiche and old newspapers by eye to research the Square Deal or the New Deal. The digital document is a boon to researchers the world over.

You can store documents digitally in one of two ways--as images or as text files. Images require far more space, but retain the character and flavor of the original document. Converting a scanned image to a text or word processing file involves what's called optical character recognition, or OCR. It's a bit of a misnomer, since you're actually processing digital information, but the term has stuck.

If the original document was written by hand or is art, storing it as an image is generally more desirable--the style of the handwriting can be as meaningful as the words themselves. The other reason for storing handwritten documents as images is that there are no commercially available handwriting recognition packages that can interpret handwritten characters from scans. So far, it's a technology stuck in the PDA and tablet world. Anne-Sophie Bellaud of Vision Objects (a purveyor of handwriting recognition software) explains that with tablets the software knows the order in which hand-printed or -scripted characters were entered. This provides huge clues for the software. Without an entry timeline, handwriting is not nearly as easy to recognize.


Scanners

No matter which way you'll be storing your documents--as images or as text files--you'll need a scanner to digitize them. If you have relatively few documents to process, a multifunction printer or a dedicated flatbed scanner such as those discussed in "Digitize Your Pictures" will suffice. They're relatively slow, however, and only the more expensive models have automatic document feeders to handle multipage documents.

Though pricey, sheet-fed scanners are just the ticket if you need to process a lot of documents. Units such as Fujitsu's US$495 ScanSnap S1500 and HP's $450 ScanJet Professional 3000 scan both sides of a document at once and average 20 pages per minute or better. I'll give the HP props for slightly more reliable paper feeding with mixed document types, but the Fujitsu has the superior, better-integrated software.

OCR Software

Most scanners ship with OCR software that you can install on your PC, but if yours lacks it, you can buy the software separately. ABBYY's $50 FineReader 9 Express ($400 for Pro 10), Nuance's $150 OmniPage 17 Standard (the Pro version is $500), and Adobe's $299 Acrobat X Standard (Pro is $449) are all good choices. Nuance's $100 PaperPort 12 Standard (Pro is $200) also scans, does OCR, and adds document management features that make it easier to keep track of your documents. Less expensive versions exist for most of these programs, so slow your heart rate.

In my hands-on tests with clean 300-dpi scans, Acrobat did the best job of converting documents, followed closely by FineReader, and not so closely by OmniPage and PaperPort. But the latter three products did better with the three low-quality, 150-dpi scans that I included among my test documents.

For documents stored as images, 150 to 200 dpi is usually fine, but OCR software works much better with 300 dpi scans. Much depends on your needs. If you just want to retain legibility, you may be able to drop the dpi and reduce your storage requirements.
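To see why dropping the dpi saves so much space, here is a rough sketch of the arithmetic. It assumes a letter-sized page (8.5 by 11 inches) scanned in 24-bit color with no compression; the function name and defaults are illustrative, and real JPEG or PDF files will be much smaller than these raw figures.

```python
def scan_size(dpi, width_in=8.5, height_in=11.0, bytes_per_pixel=3):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page.

    Assumes 24-bit color (3 bytes per pixel) and no compression.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


for dpi in (150, 200, 300):
    w, h, size = scan_size(dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, about {size / 1_000_000:.1f} MB uncompressed")
```

Because the pixel count grows with the square of the resolution, a 300-dpi scan holds four times the raw data of a 150-dpi scan of the same page.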


Web OCR

Several online services--such as www.free-ocr.com, www.newocr.com, and www.ocronline.com--are good for small-scale projects or one-offs. First you scan the original to your PC, then upload the document to the website.

The services have limitations: My tests yielded results that weren't very accurate. Also, only text is recognized, not lines and other page elements.

The first service mentioned above, www.free-ocr.com, is free, but files can be no larger than 2MB (about 150 dpi for a letter-sized page) and no wider or higher than 5000 pixels, and you can do no more than 10 uploads per hour.

Another service, www.newocr.com, is also free, but the interface is primitive. It does a much better job, though, of pulling text than free-ocr.com, and it allows documents up to 5MB in size.

Finally, www.ocronline.com requires creating a free account, but allows 4MB images (about 200 dpi per page) and up to 15 uploads per hour. You get 10 free credits, but after that you must pay for them. The site sells credits in varying quantities, from 50 pages for $3.95 (8 cents per page) up to 5000 pages for $49.95 (1 cent per page). I got good results with this service, which handles graphic elements as well as text, though it wasn't up to the standards of Acrobat X or FineReader 10.
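If you plan to push a batch of scans through these sites, it can help to check each file against the limits first. The sketch below simply encodes the figures quoted above; the services may have changed them since. It assumes 1MB means 1024 x 1024 bytes, and the function names are made up for illustration.

```python
# Limits as quoted in the article; None means no limit was mentioned.
# Assumption: 1MB is treated as 1024 * 1024 bytes.
MB = 1024 * 1024
LIMITS = {
    "free-ocr.com": {"max_bytes": 2 * MB, "max_side_px": 5000, "uploads_per_hour": 10},
    "newocr.com": {"max_bytes": 5 * MB, "max_side_px": None, "uploads_per_hour": None},
    "ocronline.com": {"max_bytes": 4 * MB, "max_side_px": None, "uploads_per_hour": 15},
}


def services_accepting(size_bytes, width_px, height_px):
    """Return the services whose quoted limits this scan stays within."""
    accepted = []
    for name, limit in LIMITS.items():
        if size_bytes > limit["max_bytes"]:
            continue
        side = limit["max_side_px"]
        if side is not None and max(width_px, height_px) > side:
            continue
        accepted.append(name)
    return accepted


def cents_per_page(price_dollars, pages):
    """Per-page cost of a credit bundle, in cents, rounded to one decimal."""
    return round(price_dollars * 100 / pages, 1)


# A 3MB, 200-dpi letter-sized scan is too big for free-ocr.com only.
print(services_accepting(3 * MB, 1700, 2200))
print(cents_per_page(3.95, 50), cents_per_page(49.95, 5000))
```

The per-page helper reproduces the rounded prices quoted for the smallest and largest credit bundles.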


Converting Your Books to E-Books

There's nothing like the feel, smell, and visual stability of a real book, but more and more people are happily reading virtual books using Kindles, Nooks, iPads, and other devices. You simply can't beat their portability, and the texts are searchable. It's even possible to have a decent reading experience on smartphones and iPods; I use the latter and, no, the frequent page-turning does not bother me, though I'll undoubtedly go for something larger eventually. You can purchase most books from an online store, but you may have some books in your own collection that aren't available in digital format.

Converting a physical book into an e-book requires first scanning it page by page, and then, for lack of a better term, OCR'ing it. This is tedious at best, so use a fast scanner. If you are willing to destroy the book, or know how to rebind it, use a sheet-fed scanner (see "Scanners," above). Most of the aforementioned OCR programs have features that help organize the pages.

Once you have the text file (in PDF, Word, or other format) in place, grab Calibre--a very capable and free e-book reader, organizer, editor, and publisher. Convert the file to the format appropriate for your device--EPUB or PDF, say. Once you've created a viewable file, use a reader app such as Stanza to load the e-book onto your device. Your device or app must support side-loading--that is, loading from a PC.
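Calibre also installs a command-line converter, ebook-convert, which picks the output format from the output file's extension. If you have many files to convert, a small script along these lines may save some clicking. This is only a sketch: the helper names and the file name book.docx are hypothetical, and it assumes ebook-convert is on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format="epub", title=None, authors=None):
    """Build an ebook-convert command line for one input file.

    The output file sits next to the input and differs only in extension.
    """
    source = Path(source)
    target = source.with_suffix("." + target_format.lower())
    command = ["ebook-convert", str(source), str(target)]
    if title:
        command += ["--title", title]
    if authors:
        command += ["--authors", authors]
    return command


def convert(source, **kwargs):
    """Run the conversion and return the output path (requires Calibre)."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; install Calibre and add it to PATH")
    command = build_convert_command(source, **kwargs)
    subprocess.run(command, check=True)
    return command[2]


# Show the command that would run for a hypothetical OCR'd manuscript.
print(build_convert_command("book.docx", title="My Scanned Book"))
```

Building the command separately from running it makes it easy to preview what will happen before converting a whole folder.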

Jon L. Jacobi

PC World (US online)