Which class(es) should I use for an HTML Parser?

  • Thread starter: Jack

Jack

How can I quickly build an HTML parser in C#?
Does anything like an "HTMLParser" class exist?
Thanks in advance,
Jack
 


I have found an HTML parser class written in C# by Jeff Heaton
(http://www.jeffheaton.com).
It worked very well, except that it does not handle Chinese characters well.
The source code of the parser class is in the csspider project, which you
can get from his website.
 
Jack,

The HtmlDocument class in the System.Windows.Forms namespace (God knows
why it's in this namespace) will allow you to get the document structure of
a document loaded through the WebBrowser control.
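
A rough sketch of that approach, in case it helps. It assumes a console project that references System.Windows.Forms; the URL and the class name are placeholders, not anything from your project. The WebBrowser control needs an STA thread and a message loop, which is why the code calls Application.Run().

```csharp
using System;
using System.Windows.Forms;

static class WebBrowserSketch
{
    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        var browser = new WebBrowser { ScriptErrorsSuppressed = true };

        browser.DocumentCompleted += (sender, e) =>
        {
            // The event can fire once per frame; wait until the whole page is ready.
            if (browser.ReadyState != WebBrowserReadyState.Complete)
                return;

            HtmlDocument doc = browser.Document;

            // List the target of every anchor element in the loaded page.
            foreach (HtmlElement link in doc.GetElementsByTagName("a"))
            {
                Console.WriteLine(link.GetAttribute("href"));
            }

            Application.ExitThread(); // stop the message loop started below
        };

        browser.Navigate("http://example.com/"); // placeholder URL
        Application.Run(); // pump messages so the control can load the page
    }
}
```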

However, this is probably not what you want. You can always use MSHTML
through COM interop to parse the page without a UI component (it will be the
same engine as IE).
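
Something along these lines should work for the MSHTML route. Treat it as a sketch under assumptions: it assumes the project references the Microsoft.mshtml interop assembly, and the HTML string and class name are made up for illustration. In practice you would pass in the page source you downloaded yourself.

```csharp
using System;
using mshtml; // assumes a reference to the Microsoft.mshtml interop assembly

static class MshtmlSketch
{
    [STAThread]
    static void Main()
    {
        // Hypothetical input; normally this is the downloaded page source.
        string html =
            "<html><body><a href='http://example.com/'>Example</a></body></html>";

        // Create the MSHTML document object and feed it the markup.
        IHTMLDocument2 doc = (IHTMLDocument2)new HTMLDocument();
        doc.write(html);
        doc.close();

        // Walk the anchors that MSHTML found while parsing.
        foreach (IHTMLElement link in doc.links)
        {
            Console.WriteLine(link.getAttribute("href", 0) + " -> " + link.innerText);
        }
    }
}
```

No WebBrowser control or form is involved here, so this fits a service or command-line tool better than the HtmlDocument class does.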
 