Umax AstraNET iA101: language of OCR?

Ivan A Derzhanski · Oct 9, 2004

I bought a (used) Umax AstraNET iA101 scanner. It does images
adequately for my purposes, but when it comes to OCR there is
something I don't get.

OCR software is supposed to be enclosed, and it is. On the CD
there is a directory called \Vstascan\Ocr, and in it there are,
among other things, 38 .LCD files named after different languages
(from Afrikns.lcd to Ukrain.lcd) and 15 .LMD files of a similar kind.

However, the only way to actually do OCR according to the Operation
Manual, and the only one I've been able to find, is to push the scan
button on the front panel and choose the typewriter icon (rather than
the palette or the spreadsheet), and then the document is read into,
say, Microsoft Word format. The user doesn't get to choose the language,
or anything else about the text for that matter. I tried scanning a
Cyrillic text, and the outcome was utter garbage, because the scanner
is looking for plain (i.e. diacritic-free) Roman.

So my questions to anyone knowledgeable are:

* Is this scanner (or a similar model) supposed to do OCR as is,
that is, using only the software that comes with it?
*** If so, how?
*** If not, whatever are all those .LCD and .LMD files there for?

Thanks in advance!

--
<fa-al-_haylu wa-al-laylu wa-al-baydA'u ta`rifunI
wa-as-sayfu wa-ar-rum.hu wa-al-qir.tAsu wa-al-qalamu>
(Abu t-Tayyib Ahmad Ibn Hussayn al-Mutanabbi)
Ivan A Derzhanski <http://www.math.bas.bg/ml/iad/>
H: cplx Iztok bl 91, 1113 Sofia, Bulgaria <[email protected]>
W: Dept for Math Lx, Inst for Maths & CompSci, Bulg Acad of Sciences

Mendel Leisk · Oct 9, 2004

Not answering your question directly, but I've tried a few OCR
programs and settled on ABBYY FineReader: it has quite a high
accuracy rate, good formatting, lots of output options (from plain
text copied to the clipboard through to fully formatted Word documents
with embedded JPEGs) and is reasonably easy to use. The price has climbed
a lot since ver. 5 (which was around US$90 for the Pro version), though.
Ver. 6 improved accuracy and formatting somewhat over 5. Ver. 7 wasn't
much of a change, as far as I could tell.

Any flatbed scanner that does 300 dpi should do. I think you can
download a trial, but it's been a while, so I'm not sure. Google the name.
