OCR problem

Murgi

Before purchasing OmniPage Pro 14, TextBridge 11, FineReader Pro, or some
other OCR software, I need to know whether it can handle the following:




I have been translating part lists for the automotive industry on a daily
basis for many years. The source text is provided as printed media.

Recently I was asked to type a 6-digit part number (the numbers are
non-sequential) in front of every translated part. This is tedious work
unless it can be automated.
I tried scanning the numbers, but the result isn't usable for a simple
reason: the underlining interferes with recognition. Each number is written
on one line and is "underlined" like this:

123456 texttexttext
------
235673 texttexttext
------
735499 texttexttext
------

The underlining actually runs just 1 mm beneath the text (color: black). (I
don't know how to present this style here in the newsreader.)


How can I remove these perforated (dashed) underlines? Obviously it depends
on the OCR software; simple OCR packages won't handle this task!
The print is otherwise clean enough to scan without problems. Which OCR
software can do what I want to achieve?

It was suggested that OmniPage, TextBridge or FineReader might have a
function to eliminate this "noise". Does anybody know whether this works?

I really want to automate this task, since I am wasting too much time
typing thousands of stupid numbers.

Do you have any ideas or suggestions on how to solve the problem?

Thanks,
Murgi
 
John Corliss

This newsgroup is supposed to be for the discussion of freeware only.
All of the programs you're talking about are commercial software.

But to answer your question, try opening the file in a word processor
and doing a search and replace for the underlining if it consists of
dashes. If it's code type underlining, simply select all text and
press the underline button twice.
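
If the OCR output does contain the rules as separate lines of dashes, the search-and-replace step could also be scripted. Below is a minimal sketch in Python, assuming the recognized text was saved as plain text in the layout shown in the question. The function names are hypothetical and not part of any OCR package.

```python
import re

# A line that holds nothing but two or more dash-like characters.
RULE_LINE = re.compile(r"^\s*[-_=]{2,}\s*$")
# A line that starts with exactly six digits followed by some text.
PART_LINE = re.compile(r"^\s*(\d{6})\s+(.*\S)\s*$")


def strip_rule_lines(text):
    """Return the text without lines made up only of dashes or similar."""
    kept = [line for line in text.splitlines() if not RULE_LINE.match(line)]
    return "\n".join(kept)


def extract_parts(text):
    """Return (part_number, description) pairs, one per matching line."""
    pairs = []
    for line in strip_rule_lines(text).splitlines():
        match = PART_LINE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs
```

Run on the three-line sample from the question, `extract_parts` would return the three part numbers paired with their text, ready to be placed in front of the translations. This only helps if the OCR engine reads the digits correctly despite the underline.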
 
Murgi

John Corliss said:
This newsgroup is supposed to be for the discussion of freeware only.
All of the programs you're talking about are commercial software.

Hi... I said "other OCR software" (which might include a free one).


But to answer your question, try opening the file in a word processor
and doing a search and replace for the underlining if it consists of
dashes. If it's code type underlining, simply select all text and
press the underline button twice.

Thanks for the hint. I'll do that right now. It might not work, but the
commercial OCR packages aren't up to the task either (hearsay at the
moment).
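
If the word-processor route fails because the underline has already corrupted the recognized digits, the rule could instead be erased from the scan before OCR. Below is a rough sketch in Python, assuming the page has already been binarized into rows of 0/1 pixels and that every dash is wider than any horizontal stroke in the digits. `erase_long_runs` and `min_run` are illustrative names, and this is not known to be how any of the packages named here works.

```python
def erase_long_runs(rows, min_run):
    """Return a copy of a binary image with long horizontal ink runs cleared.

    rows:    list of equal-length lists, 1 = ink, 0 = paper.
    min_run: runs of at least this many consecutive ink pixels are erased.
    """
    cleaned = [list(row) for row in rows]
    for row in cleaned:
        start = None
        # The trailing 0 is a sentinel that closes a run touching the right edge.
        for x, pixel in enumerate(row + [0]):
            if pixel and start is None:
                start = x
            elif not pixel and start is not None:
                if x - start >= min_run:
                    row[start:x] = [0] * (x - start)
                start = None
    return cleaned
```

Choosing `min_run` is the delicate part: it has to be longer than the widest horizontal stroke in a digit at the scan resolution, but no longer than one dash of the underline.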

Best regards,
Murgi
 
Murgi

John Corliss said:
But to answer your question, try opening the file in a word processor
and doing a search and replace for the underlining if it consists of
dashes. If it's code type underlining, simply select all text and
press the underline button twice.

Here is a possible solution to my problem. It's still not freeware, but it's
worth the money if it does the trick, since it costs only a fraction of what
I can save or earn on future jobs of this kind.

I approached the company that makes OmniPage and TextBridge... and
may never receive an answer.
But the makers of FineReader sent me a good response within a day. If this
one doesn't do the trick, I'll just outsource these daily jobs to a housewife
in the neighborhood who wants to make some extra money.

*********************

ABBYY FineReader 7.0 can recognize underlined text and allow you to erase
the underlining. Also, you can save the recognized text into a Microsoft
Word document and uncheck underlining.

You can download a trial copy of ABBYY FineReader 7.0 from our web-site:
http://www.abbyy.com/ocr_products.asp?param=28844
Or contact our sales-manager ([email protected]) to purchase a licensed copy of
ABBYY FineReader 7.0.

With best regards,
Julia Mosenkova
Technical Support Service
ABBYY Software House
Phone: +7(095)7833700
E-mail:[email protected]
http://www.abbyy.com
 
