rth wrote:
> John Corliss wrote:
>> rth wrote:
>>
>>> Any freeware to do PDF -> HTML? Especially PDFs with images and
>>> tables/columns in them?
>>
>> Yesterday I looked and looked. Found one named "Clickcat PDF-to-HTML"
>> which you should avoid because the free version has this limitation:
>>
>> "In the generated HTML code some letters are exchanged."
>>
>> Also, the download is 43 MB!
>> I did, however, find PDF2HTML at Snapfiles:
>>
>> http://www.snapfiles.com/get/pdf2html.html
>>
>> Note that it's a command-line tool, though. Homepage is here:
>>
>> http://sourceforge.net/projects/pdftohtml/
>
> Thanks, John. Got the whole bit, plus the GUI and ghost, but it didn't do a
> very good job...

Sorry that didn't work for you.

> I'd as soon settle for something that'd pull the text out
> as plain text and find some other way of sucking out the pictures... if I
> could get that stuff separately then I could re-author the doc as html by
> hand.

You can pull the text out by selecting and copying it right inside
Acrobat Reader (I use version 5.0). It even retains some of the
formatting if you paste into WordPad instead of Notepad.

As for the images, press the Print Screen key with the page displayed,
then paste the clipboard into an image editor such as PhotoFiltre.
Next, either crop the pasted screenshot down to the image you want, or
select that image within the screenshot, copy it, and paste and save
*it* as a new image. You can then combine the images and the text in
your favorite HTML editor to make the page.

Note, though, that copying images this way may reduce their quality,
because a screenshot captures them only at the resolution shown on
screen.
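
If you'd rather not do the final combining step by hand, here is a
rough sketch of one way to script it in Python. It assumes you have
already saved the copied text as a plain text file and the cropped
images as PNG files in the same folder. The names doc.txt, doc.html
and build_page are made up for illustration and are not part of any
tool mentioned above. Blank lines in the text are treated as paragraph
breaks, and the images are simply appended after the text, so you
would still need to move them into place yourself.

```python
import html
import re
from pathlib import Path


def build_page(title, text, image_names):
    """Return a bare-bones HTML page: text paragraphs, then the images."""
    # Treat one or more blank lines in the pasted text as a paragraph break.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head><title>%s</title></head>" % html.escape(title),
        "<body>",
    ]
    for p in paragraphs:
        # Join wrapped lines, and escape &, < and > so the copied text
        # cannot break the markup.
        parts.append("<p>%s</p>" % html.escape(" ".join(p.split())))
    for name in image_names:
        parts.append('<img src="%s" alt="">' % html.escape(name, quote=True))
    parts += ["</body>", "</html>"]
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; adjust to whatever you saved.
    source = Path("doc.txt")
    if source.exists():
        images = sorted(p.name for p in Path(".").glob("*.png"))
        page = build_page("My document", source.read_text(encoding="utf-8"), images)
        Path("doc.html").write_text(page, encoding="utf-8")
```

Run it in the folder holding the text and image files, then open the
resulting doc.html in your HTML editor to rearrange things.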
HTH
--
Regards from John Corliss
No adware, cdware, commercial software, crippleware, demoware,
nagware, shareware, spyware, time-limited software, trialware, viruses
or warez please.