Getting text from PDF

Klaus Jensen

Hi!

I need to extract all text from PDF files for full-text indexing purposes.
How do I do that?

I have looked at several PDF components, but none of them can read the
text in a PDF; they only create PDFs.

Using an application (or indexing service) to search the PDFs is not what I
need; I need to extract the text and store it in a database.

Any pointers and help will be greatly appreciated.

Thanks in advance

Klaus Jensen
 
Brian Henry

If you are using SQL Server, all you need to do is install the Adobe PDF
IFilter, and it will full-text index the PDFs for you automatically.
 
Klaus Jensen

Brian Henry said:
If you are using SQL Server, all you need to do is install the Adobe PDF
IFilter, and it will full-text index the PDFs for you automatically.

Hi Brian

Thanks for your response!

Unfortunately, that would mean having to store the PDFs in SQL Server,
and I am talking about 1 GB of data a day... I'm afraid it is not an option.

- Klaus
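
To make the requirement concrete, here is a minimal sketch of the setup described above: the PDFs stay on disk and only their extracted text goes into the database. Everything in it is an assumption for illustration and not something from the thread: it uses Python with SQLite rather than SQL Server, the names open_index, index_pdfs, search and pdf_text are hypothetical, and the actual text extraction is left as a function you pass in, since no extraction component has been identified here.

```python
import sqlite3
from pathlib import Path
from typing import Callable, Iterable, List


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the text table if it does not exist.

    Only the file path and the extracted text are stored, never the PDF itself.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_text ("
        "path TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def index_pdfs(
    conn: sqlite3.Connection,
    paths: Iterable[str],
    extract_text: Callable[[Path], str],
) -> int:
    """Store the extracted text of each PDF; return how many files were stored.

    extract_text is a placeholder (assumption): plug in whatever function
    your PDF component provides for turning a file into plain text.
    """
    count = 0
    for path in paths:
        body = extract_text(Path(path))
        # Re-indexing the same path overwrites the previous text.
        conn.execute(
            "INSERT OR REPLACE INTO pdf_text (path, body) VALUES (?, ?)",
            (str(path), body),
        )
        count += 1
    conn.commit()
    return count


def search(conn: sqlite3.Connection, term: str) -> List[str]:
    """Return the paths whose stored text contains term (simple LIKE match)."""
    rows = conn.execute(
        "SELECT path FROM pdf_text WHERE body LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

The LIKE query is only a stand-in for a real full-text index; the point of the sketch is that the database holds text rows keyed by file path, so the daily volume of PDF data never has to be loaded into it.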
 
