This glossary presents its terms grouped under six headings.
ADF: An Automatic Document Feeder allows multiple pages to enter a scanner without each page being placed separately. Some ADFs are built into scanners; others are add-on products. OmniPage accepts input from ADFs. This is useful for scheduled jobs.
Flatbed scanner: A scanner in which the paper remains stationary and the scanning head moves. Such scanners generally produce higher quality images than comparable sheetfed scanners. A flatbed scanner needs an ADF to scan multiple sheets automatically.
Autoscan: This is for scanning long documents with a flatbed scanner without an ADF. It starts new scans at regular user-defined intervals. A dialog box allows each pause to be lengthened or shortened.
Sheetfed scanner: A scanner in which the scanning head remains stationary and the paper moves. Such scanners usually occupy less space and have a built-in ADF, but generally a comparable flatbed scanner will produce higher quality images.
Dpi: Dots per inch (dpi) is a measure of the resolution (pixel density) of an image. The optimum resolution for OCR is 300 dpi.
Resolution: This is a measure of pixel density in an image. It is measured in dots per inch (dpi). OmniPage can accept images up to 600 dpi, but the ideal resolution for OCR is 300 dpi. Resolution can be specified for saving image files in the converter options dialog box.
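As a rough illustration of what a dpi value means in practice, the following sketch computes the pixel dimensions of a scanned page from its physical size and resolution. The function name `pixel_dimensions` and the US Letter page size are assumptions chosen for the example; they are not part of OmniPage.

```python
def pixel_dimensions(width_in: float, height_in: float, dpi: int) -> tuple:
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)

# US Letter (8.5 x 11 inches) at the 300 dpi recommended for OCR
letter_300 = pixel_dimensions(8.5, 11, 300)
# The same page at 600 dpi, the highest resolution OmniPage accepts
letter_600 = pixel_dimensions(8.5, 11, 600)

print(letter_300)  # (2550, 3300)
print(letter_600)  # (5100, 6600)
```

Doubling the resolution quadruples the number of pixels in the image.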
Brightness: A measure of how dark or pale a scanned image will be. OCR quality depends heavily on good brightness settings. An image where letter shapes run together is too dark. When letter shapes are thin or broken, the image is too light. Brightness can be changed in the Image Enhancement window. Many scanners have an auto-brightness feature.
Contrast: A measure of how much difference there will be between light and dark parts of an image. OCR quality depends heavily on good brightness and contrast settings. Not all scanners allow manual adjustment of contrast. Contrast can be improved in the Image Enhancement window.
Scanner Setup Wizard: A program that uses information from a user as well as pre-set settings to configure OmniPage to communicate with a scanner for the best possible scanning results.
Scanner drivers: Computer files created by a scanner manufacturer that allow communication between the scanner and the computer's operating system. Many scanner models have several versions of drivers that are each compatible with certain versions of Windows. Many scanner manufacturers have drivers on their web sites available for download.
Native User Interface: The scanner interface provided by your scanner manufacturer. By choosing to show the Native User Interface, it is possible to change the settings that the Scanner Setup Wizard has set for the scanner.
TWAIN: Popularly expanded as Technology Without An Interesting Name, although the name is not officially an acronym. An accepted set of standards for many scanners and digital cameras. Scanners must be TWAIN compliant to function correctly with OmniPage. Scanner manufacturers can be asked which of their scanners are TWAIN compliant.
Image: An electronic picture of text (and/or graphics) from a scanned paper document or an image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form a picture of text. The best pixel density (resolution) for OCR is 300 dpi.
Image file: A file that contains one or more page images, typically created from a scanner. Image files may present pictures and/or text. Image files with text can be processed by optical character recognition (OCR) to generate editable text. OmniPage supports a wide range of image file formats.
Original image: The image obtained from a scanner or an image file before it enters the program. It may be black-and-white, grayscale or color. It becomes the Primary image when loaded and displayed in the Page Image panel. The Primary image may not be the same as the original image, depending on your pre-processing settings.
Background: Each page image has a background value: process or ignore. This value can be changed from the Image toolbar. There are also process and ignore zones. All process areas are auto-zoned with recognized text and graphics transferred to the Text Editor.
Zones: Areas enclosed by borders drawn over the page image in the Page Image panel. Zones have a zone type and may have a zone contents value. Both can be automatically or manually assigned. Zoning, the creation of zones, can be done manually or automatically or with a combination of both. Areas inside Ignore zones are always ignored during OCR.
Auto-zoning: The automatic drawing of zones on a page image. Elements inside process zones and on a process background are auto-zoned. Auto-zoning is influenced by the original layout description setting. Alternatives to auto-zoning are manual or template zoning.
Manual zoning: This is the process of drawing user-defined zones on an image. Alternatives are auto-zoning or template zoning. Manual zoning can be done as part of manual processing, by interrupting automatic processing, or by running a workflow containing the Zone Images step with the Display images for manual zoning option.
Zone contents: Defines the set of characters OmniPage will accept from a given text or table zone during OCR. Alphanumeric zones allow all characters needed for the current language choice. Numeric zones generate text containing only numbers and number-related punctuation. A zone's shortcut menu allows the zone contents to be specified.
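The effect of a numeric zone can be pictured as a character filter. The sketch below is only an illustration: the set of allowed punctuation and the function name `fits_numeric_zone` are assumptions, and the rules OmniPage actually applies may differ.

```python
# Assumed set of digits and number-related punctuation; the real set may differ.
NUMERIC_CHARS = set("0123456789.,-+%/() ")

def fits_numeric_zone(text: str) -> bool:
    """Return True if every character would be acceptable in a numeric zone."""
    return all(ch in NUMERIC_CHARS for ch in text)

print(fits_numeric_zone("1,250.00"))   # True
print(fits_numeric_zone("Total: 12"))  # False
```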
Zone properties: The Zone types and the Zone contents together constitute the Zone properties. Zone properties can be changed by right-clicking inside a zone for a shortcut menu.
Zone types: The zone type influences how OmniPage handles the contents of the zone. Auto-generated zone types depend on the input page description. There is a zone drawing tool for each zone type: Process, Ignore, Text (with three sub-types), Table, Graphic and Form. Zone types can be changed with a shortcut menu.
Process zone: A page image area that will be auto-zoned and recognized. After processing it may result in one or more text, table or graphic zones. Page backgrounds can also be designated as "Process".
Ignore zone: A page area to be excluded from processing. There will be no OCR or graphic transfer from this area. It is displayed as shaded (appears gray on white pages). Page backgrounds can also be designated as "Ignore".
Process background: A page area that will be auto-zoned with recognized text and graphics transferred to the Text Editor. Process backgrounds take the color of the scanned page (typically white). All areas outside manually drawn zones on a process background will be auto-zoned. A process background can be changed to an ignore background from the Image toolbar.
Ignore background: A shaded page area that will be ignored during processing. Zones drawn on this background define which page areas will be processed. An ignore background can be changed to a process background from the Image toolbar.
Zone template: A background value and a set of zones and their attributes, including shape, size, position, zone contents and zone type, that are saved to a template file. Zone templates can be created and used for documents with the same layout zoning requirements. Any number of zone template files can be saved and embedded in an OPD file, but only one can be loaded at a time.
Standard toolbar: This toolbar offers buttons for performing basic program functions. By default it is located horizontally under the menu bar. It can be floated, docked elsewhere and hidden.
OmniPage Toolbox: The area on the OmniPage Desktop that contains the Start button (for automatic processing and starting a selected workflow), and the Get Page, Perform OCR and Export Results buttons for manual processing. Drop-down lists serve for selecting options. The OmniPage Toolbox in Quick Convert View has three buttons: Get and Convert, Get Page and Convert Document.
Page Image panel: This is one of the main screen areas in the OmniPage Desktop. It can display the current page image with its zones. The display is controlled by buttons on the status bar.
Image toolbar: This contains the background, zone, table and image tools. By default, it appears vertically to the left of the current page displayed in the Page Image panel.
Easy Loader: A panel in OmniPage that provides one-click file loading, and in Quick Convert View also allows one-button file processing: load, recognize, save.
Thumbnails: Each document page can be displayed as a numbered thumbnail in the Thumbnails panel. The current page is shown with an 'eye' icon. Selected pages have a distinctive appearance. Thumbnails allow pages to be opened, moved or deleted. Access to further thumbnails is by scrolling. Icons show page state. In the Thumbnails panel, document title bars with a Show Thumbnails button represent inactive documents.
Text Editor: This is one of the main screen areas in the OmniPage Desktop. Recognized pages are displayed here in one of three Editor formatting levels. Proofing and reading text aloud are done through the Editor. The WYSIWYG Editor allows font, paragraph and page level editing before export. The Editor can be given more screen area by moving splitters, and it can be hidden or displayed by clicking the Text Editor button in the status bar.
Document Manager: This is one of the main screen areas in the OmniPage Desktop. It displays a table whose columns provide statistical and status information for each page. Each row relates to one page. The columns to be displayed and their order can be customized. The Document Manager allows page operations to be done.
Workflow Assistant: This guides you in creating or modifying workflows by offering the appropriate steps represented as icons. With the Workflow Assistant, workflows can be created either from scratch or by modifying existing ones. The Workflow Assistant is also called by the Batch Manager to let you define or modify the processing steps needed for jobs.
Shortcut menu: A menu that pops up at the cursor position when the right mouse button is clicked, listing the most frequently used commands for the current screen area.
Splitter: This is a vertical or horizontal border separating the panels on the OmniPage Desktop. A splitter can be moved by placing the cursor on it and dragging it to the desired location.
Text Editor formatting levels: The Text Editor can display recognized pages in three levels. Buttons at the bottom left of the Editor allow switching between levels. From left to right:
True Page: Styling retained. All page elements including columns placed in boxes or frames to conserve original page layout.
Formatted Text: Decolumnized text with font and paragraph styling.
Plain Text: Plain decolumnized text in a single font and style.
The formatting level to be used for saving is specified separately at saving time.
Plain Text: This is one of three Text Editor formatting levels. Plain Text view shows plain decolumnized text in a single font and style. It can contain graphics and tables. This level may be easiest for making text corrections. Use the buttons at the bottom left of the Text Editor to switch between levels. The levels from left to right are: True Page, Formatted Text and Plain Text.
Formatted Text: This is one of three Text Editor formatting levels. It displays decolumnized text with font and paragraph styling detected and retained. Graphics and tables can appear. Use the buttons at the bottom left of the Text Editor to switch between levels. The levels from left to right are: True Page, Formatted Text and Plain Text.
True Page: A Nuance technology that replicates the original page layout and formatting as closely as possible. It is one of three Text Editor formatting levels. True Page may contain simple elements (text boxes, pictures and tables) and complex elements (frames and multicolumn areas). The border color of each element denotes its contents.
Frame: An area enclosing one or more boxes (text, table, picture) in the Text Editor's True Page formatting level. A frame is placed when a visible frame border or shaded area is detected on an original image. Frame properties can be changed. Elements inside a frame can be reordered.
Multicolumn area: An area of a recognized page enclosed by an orange border in the Text Editor's True Page formatting level. These areas are determined automatically to group columns of flowing text, possibly along with pictures and tables. These areas can be ungrouped.
Recognized text: Text that has been processed (recognized). This appears in the Text Editor after OCR, in one of three formatting levels. The text may include tables, may be arranged in columns or decolumnized. Pictures may be stored together with the text. Text in the Editor can be proofed, edited or spoken aloud before it is saved or exported.
OCR Proofreader: This is part of the Text Editor. During proofing the program stops on suspect and non-dictionary words displaying the image and the OCR solution. It may offer dictionary suggestions. Words can be added to a user dictionary. IntelliTrain runs during proofing, if turned on. Proofing is better done before large-scale editing such as cutting or pasting text blocks.
IntelliTrain: A process to improve OCR results, based on user corrections during proofing. The image shape of a corrected character is used to search for similar shapes in the document, especially in suspect words. After further checking, the user's change may be applied to these cases. IntelliTrain is automatic, but can be turned on or off. IntelliTrain solutions can be edited and saved to a training file for future use.
Training: This is the process of associating character shapes (images) with the characters they represent. Training serves to improve OCR accuracy on long and similar styled documents. Manual training allows user-selected characters to be trained. Automatic training is called IntelliTrain. All training can be saved to a training file.
Training file: A file containing a library of character shapes, each assigned to a character. OmniPage uses these pre-defined solutions to make more confident OCR decisions. Training can be useful on text in an unusually styled typeface or with uncommon symbols. A training file should be loaded only for pages with a similar typographic style to the pages where the training was done. Any number of training files can be saved, but only one can be loaded at a time. Training files can be edited. Training should not be applied to Asian languages.
User dictionary: A list of words, such as sector-specific terminology, that OmniPage can use during recognition and proofing, in addition to a built-in dictionary. User dictionaries can be edited and can be compiled from existing text files. Microsoft Word's user dictionaries can be used.
Professional dictionary: A specialized dictionary for certain professions and languages. These dictionaries are consulted in addition to the standard and user dictionaries during OCR and proofreading.
Text-to-Speech: OmniPage can read text aloud in a number of languages. The RealSpeak Solo language modules convert recognized text into speech. The text section to be read is defined in the Text Editor, by mouse and/or keyboard.
SAPI: Speech Application Programming Interface. This is a standardized interface for Text-to-Speech applications. Nuance RealSpeak is SAPI-compliant.
URL support: Detected E-mail addresses and web addresses are placed in the Text Editor as hyperlinks. Hyperlinks can also be created in the Text Editor to open a linked web page or file.
WYSIWYG: What-you-see-is-what-you-get.
Unicode: This is a universal system of coding characters allowing virtually all characters and symbols in the world's languages to have a unique coding value. OmniPage offers it as an output file type for use with unformatted text. It is useful for multi-lingual documents. Recent versions of Word and other word processors contain multi-lingual support, so it is possible to export fully-formatted multi-lingual texts to them without explicitly choosing Unicode.
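The idea of one unique coding value per character can be seen in a short sketch. The sample characters below are chosen only for illustration; Python's built-in `ord` returns the Unicode code point of a character.

```python
# Three characters from different scripts: Latin "A", accented "e" and a CJK ideograph.
text = "A\u00e9\u4e2d"

# Each character has exactly one code point, whatever the language.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)  # ['U+0041', 'U+00E9', 'U+4E2D']

# UTF-8 is one common way to store these code points as bytes; it round-trips losslessly.
encoded = text.encode("utf-8")
print(len(encoded))  # 6
print(encoded.decode("utf-8") == text)  # True
```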
Statistics: These are available for each page in a document as columns in the Document Manager. Document totals are also available. Statistics available include reading speed (words/min.) and accuracy (percentage of reject to total characters). A user can choose which statistics to view.
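One plausible reading of the two statistics named above can be sketched as follows. The function names and the sample figures are hypothetical, and the exact formulas used by the Document Manager are not stated here.

```python
def reject_percentage(reject_chars: int, total_chars: int) -> float:
    """Percentage of reject characters relative to all characters on a page."""
    if total_chars == 0:
        return 0.0
    return 100.0 * reject_chars / total_chars

def words_per_minute(words: int, seconds: float) -> float:
    """Reading speed, given the word count and the recognition time in seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return words * 60.0 / seconds

# Hypothetical page: 2400 characters, 12 of them rejects, 450 words read in 3 seconds
print(reject_percentage(12, 2400))  # 0.5
print(words_per_minute(450, 3.0))   # 9000.0
```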
OCR: Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. The text becomes computer-editable, so that it does not have to be retyped manually.
LFR: Logical Form Recognition™. A recognition algorithm used to transform static forms into electronically fillable forms in OmniPage Professional. LFR runs on pages with Form as Layout Description and inside Form zones. Form editing tools are available only when LFR has run. See also: Form Data Extraction (FDE).
FDE: Form Data Extraction. An efficient method of form processing inside a workflow. An active PDF form must be used as a template and output is to CSV text files that can be interpreted by database programs. See also: LFR.
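Because FDE output is CSV text, it can be read by any program with a CSV parser. The sketch below uses Python's standard `csv` module on an invented sample; the column names and values are hypothetical, since the real ones depend on the PDF form used as the template.

```python
import csv
import io

# Hypothetical extracted form data; real column names come from the PDF form template.
sample = (
    "Name,Date,Amount\n"
    "A. Smith,2009-03-01,125.50\n"
    "B. Jones,2009-03-02,80.00\n"
)

# DictReader uses the first row as field names and yields one dict per form.
rows = list(csv.DictReader(io.StringIO(sample)))
total = sum(float(row["Amount"]) for row in rows)

print(len(rows), total)  # 2 205.5
```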
Workflow: A workflow consists of a series of steps and their settings. A workflow may include, for example, scanning, recognition and proofing steps and one or more saving steps. The Workflow Assistant serves for creating and modifying workflows. Steps are represented as icons and are offered for selection. In addition, workflows form an integral part of all jobs.
Automatic processing: An efficient processing method. Pages or whole documents are processed from start to finish using current settings.
Manual processing: A versatile processing method that permits step-by-step and page-by-page processing. It allows manual image enhancement, zoning and proofing and gives maximum control over settings.
On-the-fly processing: Zone changes in the Page Image panel made on a recognized page can be immediately processed to change the page in the Text Editor on-the-fly.
OmniPage Document: The program's proprietary file type (*.opd). OmniPage Documents can consist of page images, zones, recognized text, settings and training data. OmniPage Documents allow verification, proofing, editing or adding pages to be done when the document is reopened in future sessions. User dictionaries, training files, zone templates, or image enhancement template files can be embedded in OPDs and extracted at a remote location.
Page layout description: This influences how the program auto-zones pages. Choices are: Auto, Single-column without table, Single-column with table, Multiple-column without table, Spreadsheet, Form, Legal Pleading, Template and Custom. Help details the use of each type.
Options dialog box: This is the central location for most OmniPage settings, accessed from the Tools menu or a Standard toolbar button. Groups of settings can be viewed and selected by clicking tabs in the dialog box.
Portable Document Format (PDF): A document format widely used in web pages and for displaying documents. OmniPage can open PDF files and create an editable version of their displayed texts. It can also save recognition results to five variants of PDF files: viewable only, viewable with image replacements of uncertain characters, viewable and searchable, viewable and editable, and edited.
Direct OCR: This lets you scan or load pages, recognize them, and insert the results at the cursor position in an open document in a Microsoft Office application or in WordPerfect. Direct OCR should be enabled for these applications in OmniPage under Tools > Options > General. Direct OCR places two buttons, Acquire Text and Acquire Text Settings, in the enabled applications.
ODMA: Open Document Management Application Programming Interface. This interface makes the functionality of Document Management Systems (DMS) and Enterprise Content Management (ECM) systems accessible for other programs. OmniPage is able to work with this interface.
Job: A job has a name, a type, timing and a workflow. The Batch Manager calls the Job Wizard to create new jobs and modify existing ones. The Job Wizard handles the job name, type and timing. The Batch Manager then calls the Workflow Assistant to let you define a series of processing steps and their settings. Any number of jobs may be specified, each to be started at a given time. One or more export files can be specified for each job.
Batch Manager: This is a separate but integrated program that lets you schedule jobs to be processed at some time in the future. The Batch Manager first calls the Job Wizard and then the Workflow Assistant.
Job Wizard: This serves for naming jobs, selecting a job type and specifying start and stop times and other options.
Watched folders: Input folders for Folder watching jobs, Barcode cover page jobs and Mailbox watching jobs (Outlook or Lotus Notes). The Job Wizard in the Batch Manager serves for selecting folders to be watched for arriving files of specified types. Watched folders allow processing to be started automatically whenever image files are placed in these pre-defined folders.
Barcode cover page job: A job type in the Job Wizard of OmniPage Professional. Barcode cover page jobs consist of a workflow (for describing the processing steps), a barcode cover page (for identifying a workflow) and the timing instructions for folder watching. The starting time for processing is defined by the moment the barcode cover page enters a watched barcode folder. Separate folders should always be used for barcode processing.
Folder watching job: A job type in the Job Wizard of OmniPage Professional. Folders can be specified to be watched for incoming image files. Files entering a watched folder from local or remote locations will be processed automatically on arrival.
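The general idea of folder watching can be sketched as a periodic check for files that were not present before. This is not how OmniPage implements it; the function name `find_new_files`, the list of extensions and the polling approach are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of file types to watch for; a real job lets the user specify these.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".png", ".pdf"}

def find_new_files(folder, seen):
    """Return image files in `folder` that are not yet in `seen`, and record them."""
    arrivals = []
    for path in sorted(Path(folder).iterdir()):
        is_image = path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
        if is_image and path.name not in seen:
            seen.add(path.name)
            arrivals.append(path)
    return arrivals
```

A watcher would call `find_new_files` at regular intervals, keeping the same `seen` set between calls, and hand each returned file to the processing steps.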
Mailbox watching job: A job type in the Job Wizard. Mailbox watching jobs serve for extracting attachments from incoming mail messages. These jobs allow processing to be started automatically whenever image file attachments are placed in predefined folders in your mailing system. Mailbox watching jobs are available only if a supported mailing system is installed and functioning properly on your computer.
Target application: This is a program destined to receive recognized text and/or images from OmniPage. Typically it is a word processor, a Desktop Publishing (DTP) program, a spreadsheet, a Web editor, an E-mail facility or a Kindle device. The Save to File dialog box offers a large number of target applications to which recognition results can be saved. OCR results can also pass to target applications by drag-and-drop, Clipboard or Direct OCR.
ECM: Enterprise Content Management. See also: ODMA.
DOCX: DOCX is the default file type in Microsoft Word 2007.
MAPI: Mailing Application Programming Interface. This is a standardized interface for mailing applications. OmniPage can send recognition results to any MAPI-compliant system. MS Outlook and MS Exchange are MAPI-compliant.
MRC technology: A high compression technology used for color and grayscale PDF images or PDF Searchable Images (PDF with image on text).
Open password: Open passwords can be applied to PDF files created in OmniPage or via Nuance PDF Create. A protected PDF can be opened only by someone with this password (or its permissions password). Also known as a User Password.
Permissions password: Permissions passwords can be applied to PDF files created in OmniPage or via Nuance PDF Create. Only people having this password can perform defined actions, such as opening, copying, printing or modifying the PDF file. A permissions password is needed to change the permissions that are set. Also known as a Master Password.
Page formatting level: An option for defining how much formatting should be saved with a document. The choices offered depend on the saving file type. Three levels (True Page, Formatted Text, and Plain Text) are available for display in the Text Editor. Other saving choices can be Flowing Page and Spreadsheet.
XML: XML is an abbreviation for Extensible Markup Language. It is an open standard for defining data elements. XML uses a tag structure (like HTML).
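The tag structure can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. The element and attribute names below are invented for the example and do not reflect any OmniPage output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a <page> element containing two <zone> elements.
doc = "<page number='1'><zone type='text'>Hello</zone><zone type='table'/></page>"

root = ET.fromstring(doc)

# Tags nest like HTML, and each element can carry attributes and text.
zone_types = [zone.get("type") for zone in root.findall("zone")]
print(root.tag, root.get("number"), zone_types)  # page 1 ['text', 'table']
print(root.find("zone").text)  # Hello
```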
XPS: XPS is an abbreviation for XML Paper Specification. XPS documents can be displayed in an XPS viewer that works with Internet Explorer. When an XPS file is double-clicked, usually Internet Explorer opens and displays it. XPS documents are similar to PDF documents; they preserve the original document layout and display text with the same fonts as the original.