Convert HTML to XML or Parse HTML

Q.Z.

Hello,

Does anybody know whether there is a .NET or COM-based library that can
parse HTML, or convert HTML to XML, so that I can use XPath to
query it?

Thanks
Qin Zhou
 
Steven Cheng[MSFT]

Hi Q.Z,


Thank you for using Microsoft Newsgroup Service. Based on your description,
you are looking for COM or .NET components that can convert an
HTML document into an XML (XHTML) style document. Is my understanding correct?

If so, I think Ken Cox has provided some good sites on this topic; they
show two COM components. You may want to give them a try to see whether they
help.

Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)
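
To make the HTML-to-XML idea in this thread concrete, here is a minimal sketch. It is written in Python with only the standard library, not with the COM or .NET components mentioned above, and the names HtmlToXmlBuilder, VOID_TAGS and html_to_xml are made up for illustration. It assumes the page has a single root element, closes unclosed tags in a naive way, and relies on ElementTree, which supports only a subset of XPath.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never have a closing tag (illustrative, not exhaustive).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HtmlToXmlBuilder(HTMLParser):
    """Hypothetical helper: builds a well-formed element tree from loose HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements that are still open

    def handle_starttag(self, tag, attrs):
        # Attributes without a value (e.g. "disabled") arrive as None.
        attrib = {name: (value or "") for name, value in attrs}
        self.builder.start(tag, attrib)
        if tag in VOID_TAGS:
            self.builder.end(tag)  # close void elements immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # ignore a stray closing tag
        # Close everything left open inside the element being closed.
        while self.open_tags:
            current = self.open_tags.pop()
            self.builder.end(current)
            if current == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # drop text outside the root element
            self.builder.data(data)

    def finish(self):
        self.close()  # flush any text the parser is still buffering
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def html_to_xml(html_text):
    parser = HtmlToXmlBuilder()
    parser.feed(html_text)
    return parser.finish()


# <b> is never closed and <br> has no end tag, yet the result is well formed.
sample = '<html><body><p>Hello <b>world</p><br><a href="x.html">link</a></body></html>'
root = html_to_xml(sample)
hrefs = [a.get("href") for a in root.findall(".//a[@href]")]
print(hrefs)  # ['x.html']
xml_text = ET.tostring(root, encoding="unicode")
```

The resulting xml_text can be handed to any XML parser. Real-world pages are much messier than this sample, so treat this as an illustration of the approach rather than a replacement for a dedicated converter.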
 
Maxim Kazitov

If you load your page into a WebBrowser control, you can parse the page using the DOM.
This is slow, but it works.


David Elliott said:
I have tried the SgmlReader but am having difficulty with some sites, such as www.msn.com.

If I could find a way to parse HTML using C/C++/C#, I would be happy. All I really
need is an array of <tag> and <data>. Finer granularity is not necessary, just
the raw information. I do need the entire page, though, from the opening <html>
to the closing </html>.
I would prefer an HTML to XML conversion, but as time is limited, any solution would be
appreciated.

Thanks,
Dave
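
For what it's worth, the "array of <tag> and <data>" idea can be sketched in a few lines. The example below is in Python using the standard library's html.parser rather than C/C++/C#, and the names TagDataCollector and tag_data_pairs are invented for illustration. It assumes the page is already available as a string and does no tree building at all; it only records tags and text in document order.

```python
from html.parser import HTMLParser


class TagDataCollector(HTMLParser):
    """Hypothetical helper: collects ("tag", name) and ("data", text) pairs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.items = []

    def handle_starttag(self, tag, attrs):
        self.items.append(("tag", tag))

    def handle_endtag(self, tag):
        # Closing tags are recorded with a leading slash.
        self.items.append(("tag", "/" + tag))

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only runs between tags
            self.items.append(("data", text))


def tag_data_pairs(html_text):
    collector = TagDataCollector()
    collector.feed(html_text)
    collector.close()  # flush any text still buffered by the parser
    return collector.items


# The unclosed <b> is tolerated; it simply never gets a "/b" entry.
sample = "<html><body><p>Hello <b>world</p></body></html>"
pairs = tag_data_pairs(sample)
for kind, value in pairs:
    print(kind, value)
```

Because nothing is validated, malformed markup does not stop the scan, which is the main attraction when only the raw tag and text stream is needed.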
 
George Ter-Saakov

Take a look at this:
http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx

George.

 
