PC Review



convert pdf to html ??

 
 
rth
 
      28th Sep 2004
Is there any freeware that converts PDF to HTML? Especially for PDFs with
images and tables / columns in them?


 
 
 
 
 
John Corliss
 
      28th Sep 2004
rth wrote:

> Is there any freeware that converts PDF to HTML? Especially for PDFs with
> images and tables / columns in them?


Yesterday I looked and looked. Found one named "Clickcat PDF-to-HTML"
which you should avoid because the free version has this limitation:

"In the generated HTML code some letters are exchanged."

Also, the download is 43 MB!

I did, however, find PDF2HTML at Snapfiles:

http://www.snapfiles.com/get/pdf2html.html

Note that it's a command-line tool, though. Homepage is here:

http://sourceforge.net/projects/pdftohtml/
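
Since pdftohtml is a command-line tool, it can be driven from a script. What follows is only a minimal Python sketch. It assumes `pdftohtml` is on the PATH and that your build accepts `-c` (complex layout mode) and `-noframes` (one HTML file instead of a frameset); check `pdftohtml -h` for your version. The file names `manual.pdf` and `manual` are made up for illustration.

```python
import shutil
import subprocess


def build_pdftohtml_command(pdf_path, out_base, complex_layout=True):
    """Return the argument list for a pdftohtml run. Nothing is executed here."""
    cmd = ["pdftohtml"]
    if complex_layout:
        # Assumed flag: "complex" mode, which tries to keep the page layout.
        cmd.append("-c")
    # Assumed flag: write a single HTML file instead of a frameset.
    cmd.append("-noframes")
    cmd += [str(pdf_path), str(out_base)]
    return cmd


def convert(pdf_path, out_base):
    """Run pdftohtml if it is installed; raise a clear error otherwise."""
    if shutil.which("pdftohtml") is None:
        raise FileNotFoundError("pdftohtml was not found on PATH")
    subprocess.run(build_pdftohtml_command(pdf_path, out_base), check=True)


if __name__ == "__main__":
    # Hypothetical file names; only the command is printed, not run.
    print(build_pdftohtml_command("manual.pdf", "manual"))
```

Keeping the command builder separate from the function that runs it makes it easy to see exactly what would be executed before pointing it at a real file.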

--
Regards from John Corliss
No adware, cdware, commercial software, crippleware, demoware,
nagware, shareware, spyware, time-limited software, trialware, viruses
or warez please.
 
 
rth
 
      29th Sep 2004
Thanks, John. I got the whole bit, plus the GUI and ghost, but it didn't do a
very good job. I'd just as soon settle for something that would pull the text
out as plain text, and find some other way of getting the pictures out. If I
could get those pieces separately, I could re-author the doc as HTML by
hand.
 
 
John Corliss
 
      29th Sep 2004
rth wrote:
> thanks John.. got the whole bit, plus the gui and ghost but it didn't do a
> very good job...


Sorry that didn't work for you.

> I'd as soon settle for something that'd pull the text out
> as plain text and find some other way of sucking out the pictures... if I
> could get that stuff separately then I could re-author the doc as html by
> hand.


You can pull the text out by selecting and copying it right inside
Acrobat Reader (I use version 5.0). It even retains some of the
formatting if you paste into WordPad instead of Notepad. As for the
images, press the Print Screen key with the page on screen, then
paste the clipboard into something like PhotoFiltre. Next, either
crop the pasted screenshot down to the image you want, or select the
image within the screenshot, copy it, and paste and save *that* as a new
image in the editor you're using. You can then combine the images and
the text in your favorite HTML editor to make the page.

Note though, that copying images in this way may lead to a reduction
in image quality.
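
A scripted alternative to the copy-and-paste route, offered only as a sketch: the Xpdf command-line tools include `pdftotext` and `pdfimages`, which pull out the text and the embedded images separately. Neither tool is mentioned earlier in this thread, so this assumes they are installed and on the PATH. The flags used are `-layout` (try to keep the physical text layout) and `-j` (write JPEG-encoded images as .jpg files). The file names are hypothetical.

```python
import shutil
import subprocess


def text_command(pdf_path, txt_path, keep_layout=True):
    """Argument list for extracting plain text with pdftotext."""
    cmd = ["pdftotext"]
    if keep_layout:
        # Try to preserve columns and table alignment in the text output.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def images_command(pdf_path, image_prefix):
    """Argument list for dumping embedded images with pdfimages."""
    # -j saves JPEG-encoded images as .jpg; other images come out in the
    # tool's default formats.
    return ["pdfimages", "-j", pdf_path, image_prefix]


def run_if_available(cmd):
    """Run cmd only when its program is installed; report whether it ran."""
    if shutil.which(cmd[0]) is None:
        print("skipping, not installed:", cmd[0])
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical names; the commands are printed here, not executed.
    print(text_command("manual.pdf", "manual.txt"))
    print(images_command("manual.pdf", "manual-img"))
```

Because the images are written out from the file itself instead of being captured from the screen, this route should avoid the screenshot quality loss, provided the PDF actually contains them as embedded images.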

HTH

 
 
hjmler
 
      29th Sep 2004
Actually, what I finally wound up doing was making a color print of the PDF
and scanning it with OmniPage 14 to get the text, then scanning in the
images via Corel. Then, with all the pieces, I used Nvu to put it together
into HTML. Took the long way around <g>.
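
For the final "put the pieces together" step, here is a rough Python sketch of assembling recovered text and image files into a bare HTML page. It is not what Nvu does, only an illustration. It assumes paragraphs in the text are separated by blank lines, and the function name `build_page` and the file names are invented for the example.

```python
from html import escape


def build_page(title, text, image_files):
    """Build a very plain HTML page from extracted text and image file names."""
    # Treat blank lines as paragraph breaks and drop empty chunks.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    parts = ["<html>", "<head><title>%s</title></head>" % escape(title), "<body>"]
    for p in paragraphs:
        # Collapse hard line wraps inside a paragraph, then escape <, > and &.
        parts.append("<p>%s</p>" % escape(" ".join(p.split())))
    for name in image_files:
        # Images are simply appended after the text in the order given.
        parts.append('<img src="%s" alt="">' % escape(name, quote=True))
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


if __name__ == "__main__":
    sample = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph."
    print(build_page("Converted document", sample, ["img-000.jpg"]))
```

The result still needs hand editing to place each image where it belongs and to rebuild any tables, but it gives a starting file to open in an HTML editor.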