Use Optical Character Recognition To Capture Useful Information

Turn scans of book pages and photos of handwritten notes into PDFs, then use optical character recognition to make their content editable.

You’re at a museum doing research for a paper you need to write. To capture the useful information in the labels underneath the artworks, you take photos of them using the camera on your phone.

That saves you time on the spot, but when you get home, you still have to go through each photo and manually type the text on your computer—a task you neither enjoy nor have the time for. Worst of all are the typos you keep making in the process.

What’s a student to do? If you’re looking for a way to make captured information instantly usable, keep reading.

Eliminate manual data entry

You can free yourself from typing out the information contained in scans and images by turning them into machine-readable text. Using technology that recognizes the characters within images and scans, you can make text from photos of handwritten notes or scans of book pages instantly usable, meaning you can select, copy, and edit the text in them.

Here’s how this technology works:

  • First, it distinguishes dark from light in your document, recognizing all that’s dark as characters to be read and all that’s light as background.
  • Second, it processes the characters, distinguishing them as either numbers or letters.
  • Third, it uses pattern recognition and/or feature detection to identify each character.
  • Finally, it converts each identified character into its American Standard Code for Information Interchange (ASCII) code, a long-established encoding for plain text on computers, to make it usable.
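
To make those four steps concrete, here is a toy Python sketch. It is only an illustration under simplifying assumptions, not how PDFpen or any real OCR engine is implemented: the 3x5 glyph templates, the threshold of 128, and the function names (binarize, match_score, recognize, kind) are all made up for this example. It also labels a character as a number or a letter after identifying it, which is simpler than doing so beforehand.

```python
# Toy illustration of the four OCR steps described above.
# Everything here is hypothetical and simplified for readability.

# Hypothetical 3x5 glyph templates: '#' is ink, '.' is background.
TEMPLATES = {
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def binarize(gray, threshold=128):
    """Step 1: treat pixels darker than the threshold as ink (True)."""
    return [[pixel < threshold for pixel in row] for row in gray]


def match_score(bitmap, template):
    """Count the pixels on which a bitmap and a template agree."""
    return sum(
        ink == (mark == "#")
        for bitmap_row, template_row in zip(bitmap, template)
        for ink, mark in zip(bitmap_row, template_row)
    )


def recognize(gray):
    """Step 3: pattern recognition by picking the best-matching template."""
    bitmap = binarize(gray)
    return max(TEMPLATES, key=lambda ch: match_score(bitmap, TEMPLATES[ch]))


def kind(ch):
    """Step 2 (simplified): label a character as a number or a letter."""
    return "number" if ch.isdigit() else "letter"


# A grayscale "7" (0 = black, 255 = white) with one stray dark pixel
# in the bottom-left corner to mimic scanner noise.
sample = [
    [20, 35, 10],
    [240, 230, 40],
    [250, 30, 220],
    [245, 25, 235],
    [100, 15, 250],
]

ch = recognize(sample)
code = ord(ch)  # Step 4: the character's ASCII code
print(ch, kind(ch), code)  # prints: 7 number 55
```

Even with the stray pixel, the "7" template agrees with the sample on 14 of 15 pixels, more than any other template, so it is the one chosen. Real engines work on whole pages, many fonts, and far larger character sets, but the idea of comparing ink patterns against known shapes is the same.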

That’s it. The process—known as optical character recognition (OCR)—takes just a few seconds, and allows you to:

  • Work with content without having to retype it so you can quickly turn scans and graphic files into workable PDFs.
  • Make files searchable so you can quickly find the information you need.
  • Edit the content of files so you can make updates and corrections as needed.
  • Ditch paper so you can save space and preserve your documents for longer.
  • Enable text-to-speech conversion, so you can make the content of documents accessible to the blind and the visually impaired.

PDFpen’s optical character recognition

PDFpen uses OmniPage, one of the world’s most accurate OCR engines according to TechRadar. PDFpen’s OCR technology enables you to:

  • Batch OCR, i.e., perform optical character recognition on multiple files to save time.
  • OCR documents in multiple languages (see full list) so you can make text searchable in French, Portuguese, Danish, German, and more.

What users say about PDFpen’s OCR technology

Here’s what our users say about our OCR technology:

“PDFpen and PDFpenPro do very good OCR.”
– Brett Burney 

“I love how it can take a basic PDF document and apply optical character recognition and, even in a pinch, convert it to a workable Word document. It’s a tool I use almost daily.”
– David Sparks 

“PDFpenPro’s batch OCR tool is very important to me.”
– Thomas H. Vidal 

“PDFpen is by far the easiest to use for this [OCR]. When you open an image document that’s got text on it, PDFpen recognizes this and offers to scan it for you.”
– Apple Insider

Try our optical character recognition engine free for 30 days

You can use PDFpen to turn scans and images into editable text. You can also use it to:

  • Combine and customize study materials
  • Organize study materials
  • Highlight PDFs
  • Add text and audio notes and comments to PDFs
  • Add text to PDFs
  • ...and more

Explore these and other features today. Download a free trial of PDFpen. If you like it, check out our Education Store to learn about discounts for students.