Parsing HTML

Guest

Hi,

I need to parse an HTML page and extract all of its tags, both the standard
HTML tags and our own custom tags.

I'm using MSHTML and VB.NET. One approach I thought of was to use the regex
functions. What is the best way to parse the page and fetch all the tags? If
anyone has already done this, any pointers would help so that I don't have to
reinvent the wheel.

Thanks in Advance.


Rgds,
 
Herfried K. Wagner [MVP]

Sindbaad said:
I need to parse an HTML page and extract all of its tags, both the standard
HTML tags and our own custom tags.

I'm using MSHTML and VB.NET.

What doesn't work with MSHTML?
Sindbaad said:
One approach I thought of was to use the regex functions. What is the best
way to parse the page and fetch all the tags? If anyone has already done
this, any pointers would help so that I don't have to reinvent the wheel.

.NET Html Agility Pack: How to use malformed HTML just like it was
well-formed XML...
<URL:http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx>

Download:

<URL:http://www.codefluent.com/smourier/download/htmlagilitypack.zip>
 
Guest

Thanks for the update.

There are no problems with MSHTML; I just haven't tried much with it yet, so
I wanted to know more about how to proceed.

The requirement is as follows:
From the HTML page, I would like to fetch all the custom tags in the order
they appear in the page.

eg:
<TR>
<TD class=smallText style="LINE-HEIGHT: 1.5" vAlign=top>
<FIRSTNAME>fstName</FIRSTNAME>
<LASTNAME>lstname</LASTNAME> <BR>
<ADDRESS1>addr</ADDRESS1><BR>
</TD></TR>

From the above I need to fetch only the custom tags: FIRSTNAME, LASTNAME and ADDRESS1.

Any help is really appreciated.

Rgds,
Sindbaad
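
For illustration only, here is a minimal sketch of the extraction described above. It is written in Python with the standard library's html.parser, not in VB.NET, MSHTML or the Html Agility Pack, so it shows the idea rather than the API you would call from .NET. The names STANDARD_TAGS, CustomTagCollector and custom_tags are hypothetical, and the list of standard tags is an assumed, deliberately incomplete set that would need extending for real pages. The idea is to walk the page with a tolerant HTML parser, skip every tag that is known HTML, and record the rest in document order.

```python
from html.parser import HTMLParser

# Assumption: a small, incomplete set of standard HTML tag names to ignore.
STANDARD_TAGS = {
    "html", "head", "title", "body", "table", "tbody", "tr", "td", "th",
    "br", "div", "span", "p", "a", "b", "i", "img", "ul", "ol", "li",
}


class CustomTagCollector(HTMLParser):
    """Collects non-standard tags and their text in document order."""

    def __init__(self):
        super().__init__()
        self.found = []    # [tag, text] pairs, in the order encountered
        self._open = None  # index into self.found of the tag being read

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag not in STANDARD_TAGS:
            self.found.append([tag, ""])
            self._open = len(self.found) - 1

    def handle_endtag(self, tag):
        # Stop collecting text once the current custom tag is closed.
        if self._open is not None and self.found[self._open][0] == tag:
            self._open = None

    def handle_data(self, data):
        # Text is attributed to the most recently opened custom tag only.
        if self._open is not None:
            self.found[self._open][1] += data


def custom_tags(html):
    """Return (tag, text) tuples for every non-standard tag in html."""
    parser = CustomTagCollector()
    parser.feed(html)
    parser.close()
    return [(tag, text.strip()) for tag, text in parser.found]


if __name__ == "__main__":
    sample = """<TR>
<TD class=smallText style="LINE-HEIGHT: 1.5" vAlign=top>
<FIRSTNAME>fstName</FIRSTNAME>
<LASTNAME>lstname</LASTNAME> <BR>
<ADDRESS1>addr</ADDRESS1><BR>
</TD></TR>"""
    for tag, text in custom_tags(sample):
        print(tag, text)
```

Run against the sample fragment, this yields firstname, lastname and address1 (lowercased by the parser) with their text, in page order. A real parser is used here instead of regular expressions because it copes with unquoted attributes and unclosed tags such as BR. The sketch does not handle custom tags nested inside other custom tags.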
 
