Convert .rtf or .doc or .pdf or .htm to plain txt

 
 
Dave
 
      28th Jan 2005
Greetings,

Is anybody aware of any code that will allow me to read .rtf, .doc, .pdf,
or .htm files as plain text (so I can open a StreamReader on them)? Thanks,

-Dave


 
 
 
 
 
David Browne
 
      28th Jan 2005

"Dave" <(E-Mail Removed)> wrote in message
news:uEOED%(E-Mail Removed)...
> Greetings,
>
> Is anybody aware of any code that will allow me to read .rtf or .doc or
> .pdf or .htm as plain text (so I can do a streamreader off them). Thanks,
>


Each format would require a different tool. Microsoft Word can open .rtf and,
of course, .doc, and can save either as plain text.

For PDF, check out pdftotext.exe from the Xpdf project:

http://www.foolabs.com/xpdf/download.html

From their web site:

"Xpdf is an open source viewer for Portable Document Format (PDF) files.
(These are also sometimes also called 'Acrobat' files, from the name of
Adobe's PDF software.) The Xpdf project also includes a PDF text extractor,
PDF-to-PostScript converter, and various other utilities.

Xpdf runs under the X Window System on UNIX, VMS, and OS/2. The non-X
components (pdftops, pdftotext, etc.) also run on Win32 systems and should
run on pretty much any system with a decent C++ compiler."


It's a command-line tool, so you would need to shell out to it and then open
a StreamReader on the output file.
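
As a rough illustration of that flow, here is a minimal sketch in Python (not .NET) that shells out to pdftotext and then reads the file it wrote. The helper names build_pdftotext_command and pdf_to_text are made up for this example. It assumes a pdftotext executable is on the PATH and that your build accepts "-enc UTF-8". In .NET the same two steps would be done with System.Diagnostics.Process and a StreamReader.

```python
import subprocess
import tempfile
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path, exe="pdftotext"):
    """Build the argument list: pdftotext -enc UTF-8 <input.pdf> <output.txt>."""
    # "-enc UTF-8" is an assumption about the installed build; drop it if unsupported.
    return [exe, "-enc", "UTF-8", str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, exe="pdftotext"):
    """Shell out to pdftotext, then read the text file it wrote."""
    with tempfile.TemporaryDirectory() as tmp:
        txt_path = Path(tmp) / "out.txt"
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(build_pdftotext_command(pdf_path, txt_path, exe), check=True)
        # errors="replace" keeps reading even if the output is not valid UTF-8.
        return txt_path.read_text(encoding="utf-8", errors="replace")
```

Calling pdf_to_text("report.pdf") would return the extracted text as a single string; "report.pdf" is only a placeholder file name.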

David



 
 
Beringer
 
      28th Jan 2005
As a related topic:
Does anybody know of code examples showing how to convert RTF to HTML, XML, etc.?

Thanks in advance,
Eric



 
 
Dave
 
      28th Jan 2005
David,

This tool from Foolabs does exactly what I was looking for. However, I want to
use it from the .NET Compact Framework. Is there a way to do that?

-Dave



 
 
Matt Berther
 
      28th Jan 2005
Hello Beringer,

I'm not completely sure about this, but wvWare[1] may do what you need.

[1] http://wvware.sourceforge.net/

--
Matt Berther
http://www.mattberther.com




 
 
David Browne
 
      29th Jan 2005

"Dave" <(E-Mail Removed)> wrote in message
news:%(E-Mail Removed)...
> David,
>
> This tool from Foolabs does exactly what I was looking for. I am looking
> to use it, though, in the .NET Compact Framework. Is there a way to do
> that?
>


It's not managed code: it's a native platform binary compiled from C++. It might
run as-is, or you might be able to compile it for your platform.

David


 