Strip HTML Tags

Guest

Does anyone have an example of how to strip HTML tags from an already existing text file?

Thanks
 
AdamD,

I think the easiest way to do this would be to load the file into
an instance of HTMLDocument. You can access it through COM interop by
setting a reference to mshtml in the project references; there should
already be a managed wrapper there. Once you have that, you can just get
the innerText property of the element representing the body, and it will
return the text, sans tags.

Hope this helps.
 
Many thanks.

I also found that I could use a .NET regular expression to strip everything between the < and > of each HTML tag, including the angle brackets themselves.
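
For anyone finding this later, here is a minimal sketch of the regular expression approach in C#. The pattern and the names HtmlStripper, StripTags and StripTagsFromFile are my own assumptions for illustration, not anything posted above. It simply removes every run that starts with < and ends at the next >, so it is only suitable for simple markup: the text inside script and style elements is left behind, entities such as &amp; are not decoded, and a > inside an attribute value or comment will confuse it.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class HtmlStripper
{
    // Assumed pattern: a '<', then any characters that are not '>', then '>'.
    private static readonly Regex TagPattern = new Regex("<[^>]*>");

    // Returns the input with every tag-like run removed.
    public static string StripTags(string html)
    {
        return TagPattern.Replace(html, string.Empty);
    }

    // Reads an existing text file and returns its contents without tags.
    public static string StripTagsFromFile(string path)
    {
        return StripTags(File.ReadAllText(path));
    }
}
```

For example, HtmlStripper.StripTags("<p>Hello <b>world</b></p>") returns "Hello world". For anything more complicated than this, the HTMLDocument route suggested above is likely to be more robust, since it uses a real parser.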
 
