Scanning and Specialized OCR Systems
Narrator: Optical Character Recognition systems are one of the tools that allow people who are blind or visually impaired to access printed information.
There are three essential elements to these systems: scanning, optical character recognition, often referred to as OCR, and the reading of the text via synthesized speech.
To use this technology, users require three components: a flatbed scanner, a PC with a compatible sound card, and a specialized OCR software program with speech output. With this technology, users can scan printed, but not handwritten, text and either have it read back in synthetic speech or save it to their computer as a file that can be accessed later. When users place a printed document in the scanner and issue a command to begin the scan, the scanner takes a picture of the printed text and sends it to the computer. The OCR software then analyzes the image, recognizes the characters, and converts the information into an electronic file. This file is passed to the built-in screen reader, which uses the computer’s sound card or a dedicated speech synthesizer to speak the text.
Sample of synthesized speech: “Converting hard copy printed material into properly formatted braille documents can be one of the most challenging tasks for individuals…”
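The three stages described above (scan, recognize, speak) can be pictured as a small pipeline. The following is only an illustrative sketch: the names PageImage, scan, recognize, read_aloud, and process_page are hypothetical, the "image" is simulated with plain strings, and the recognition step is a placeholder rather than a real OCR engine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PageImage:
    """Hypothetical stand-in for the picture the scanner sends to the PC."""
    rows: List[str]  # a real image would hold pixels, not text


def scan(printed_lines: List[str]) -> PageImage:
    """Stage 1: the scanner captures the page as an image."""
    return PageImage(rows=list(printed_lines))


def recognize(image: PageImage) -> str:
    """Stage 2: OCR converts the image into electronic text.

    Placeholder logic only: a real engine analyzes pixels to identify
    characters. Here we just trim each row and drop blank ones.
    """
    return "\n".join(row.strip() for row in image.rows if row.strip())


def read_aloud(text: str, speak: Callable[[str], None]) -> None:
    """Stage 3: pass the text, line by line, to a speech output callback."""
    for line in text.splitlines():
        speak(line)


def process_page(printed_lines: List[str], speak: Callable[[str], None]) -> str:
    """Run all three stages and return the text so it can also be saved."""
    text = recognize(scan(printed_lines))
    read_aloud(text, speak)
    return text


if __name__ == "__main__":
    # print() stands in for the speech synthesizer in this sketch.
    process_page(["  Dear customer,  ", "", "your statement is enclosed."], print)
```

Passing the speech output in as a callback keeps the sketch independent of any particular synthesizer, and returning the text reflects the option of saving it as a file for later.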
The recognition process in the OCR system takes account of the logical structure of the language. It also uses a feature that applies spell checking techniques similar to those found in word processors.
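The narration does not say how this spell-checking feature works internally. As one plausible illustration, and not a description of any particular product, a recognized word that is missing from a word list can be replaced by its closest match. The sketch below uses Python's standard difflib module; the tiny LEXICON and the 0.7 cutoff are assumptions chosen for the example.

```python
import difflib

# Assumed toy word list; a real system would use a full dictionary.
LEXICON = ["the", "modern", "scanner", "reads", "printed", "text"]


def correct_word(word, lexicon=LEXICON, cutoff=0.7):
    """Return the closest lexicon word, or the word unchanged if none is close."""
    lowered = word.lower()
    if lowered in lexicon:
        return word
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text, lexicon=LEXICON, cutoff=0.7):
    """Apply word-level correction to whitespace-separated text."""
    return " ".join(correct_word(w, lexicon, cutoff) for w in text.split())


if __name__ == "__main__":
    # "rn" misread for "m" and "cl" misread for "d" are typical OCR confusions.
    print(correct_text("the rnodern scanner reads printecl text"))
```

Words with no sufficiently close match are left alone, which avoids "correcting" names or unusual terms into something unrelated.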
All OCR systems create files containing the characters and page layout of the text. With some OCR systems, users can convert these files into formats retrievable by commonly used software such as word processors, spreadsheets, and databases. Users can then access the scanned text with adaptive technology devices that magnify the computer screen or provide speech or braille output.
While OCR technology is highly accurate when scanning straight text, accuracy can decrease greatly if the print quality is poor or if the document contains mixed columns, charts, diagrams, or graphics. It’s important, therefore, for users to understand that OCR technology is not a miracle tool that can be counted on for 100% accuracy in all circumstances.
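The narration does not define how accuracy is measured. One common convention, assumed here purely for illustration, is character accuracy: count the single-character insertions, deletions, and substitutions needed to turn the recognized text into the known correct text, and divide by the length of the correct text. The function names below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, recognized: str) -> float:
    """Fraction of reference characters recovered, clamped to the range 0..1."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = edit_distance(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    # One wrong character out of four gives 75% character accuracy.
    print(character_accuracy("abcd", "abxd"))
```

A user could check a scanner and software combination this way by scanning a page whose correct text is already known and comparing the two.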
A large number of scanners are available, but the specialized OCR software doesn’t work perfectly with all models. Before purchasing a scanner, users should visit vendors’ websites and review the recommended scanner models. There are several pertinent questions that users should ask:
Does the OCR system require installation into a PC or is it a self-contained unit?
Does it recognize a wide variety of typewritten and typeset documents including books, magazines, mail order catalogs, newspapers, and bank statements?
Will it maintain the layout of the original text and recognize columns of text with a minimum of user intervention?
Does it require a minimum of computer knowledge to operate?
Does it come with documentation that is easy to understand and in an accessible medium such as large print, braille, or on cassette tape?
Will it provide online help that can be accessed while using the system and does it come with ongoing technical support from the manufacturer?
Will it scan material at an efficient speed?
And finally, will it handle various sizes of paper and horizontally formatted documents?
The specialized OCR software packages cost around $1,000. This does not include the cost of the computer or the scanner, which vary according to the desired specifications.
The current generation of OCR systems provides good accuracy and formatting capabilities with straight text, at prices that are up to ten times lower than those of a few years ago. These systems represent a worthwhile investment for users who need to access printed documents of all types.