Get data from web site - hyperlinks in divs in html table

  • Thread starter: Greg Lovern

Greg Lovern

I need to download data from a web site. I'm familiar with how to use
an InternetExplorer object to navigate to the page, etc., but how do I
get the data described below?


The web page source for the table of data I want looks like this:


<table border="0" cellpadding="15" cellspacing="0" width="580">
<tr>
<td class="copy">
<div class="mahead"><b>Item 1</b> -- <a href="http://www.website.com/webpage.html">view page</a></div>
<div class="mahead"><b>Other Thing</b> -- <a href="http://www.website.com/someotherpage.html">view page</a></div>
<div class="mahead"><b>New Stuff</b> -- <a href="http://www.website.com/yetanotherpage.html">view page</a></div>
<br /><br /><br /><br /><br />
</td>
</tr>
</table>


Where my example shows only three items, the real web page has many
more, each as a <div>, all in the same <td>.

What I need to get is:

[Field 1]     [Field 2]
Item 1        http://www.website.com/webpage.html
Other Thing   http://www.website.com/someotherpage.html
New Stuff     http://www.website.com/yetanotherpage.html


How do I get that data?



BTW, if I use Excel's Get External Data From Web feature, I get this
table:

[Field 1]     [Field 2]
Item 1        view page
Other Thing   view page
New Stuff     view page

There is no sign of the URLs, which I need.



Thanks for any suggestions.

Greg
 
Greg

This will get all the links from that particular HTML code.

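Private Sub GetAllLinks()
    Dim IE As Object
    Dim doc As Object
    Dim lnks As Object
    Dim l As Object
    Dim rng As Range

    ' Output starts at A1 of the active sheet
    Set rng = Range("A1")
    Set IE = CreateObject("InternetExplorer.Application")

    IE.Visible = True

    ' Local test file containing the HTML from the question
    IE.Navigate "C:\GetLinksFromDivsTest.html"

    ' Wait for the page to finish loading (4 = READYSTATE_COMPLETE)
    Do While IE.Busy: DoEvents: Loop
    Do While IE.ReadyState <> 4: DoEvents: Loop

    Set doc = IE.Document

    ' Every <a> element on the page
    Set lnks = doc.getElementsByTagName("A")

    For Each l In lnks
        ' Column A: the link text ("view page"); column B: the URL
        rng.Value = l.innerText
        rng.Offset(0, 1).Value = l.href

        Set rng = rng.Offset(1, 0)
    Next l

    IE.Quit: Set IE = Nothing

End Sub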

Similar code could be used to get all the tables, all the divs, etc.
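
For instance, to pair each item name with its URL (the two fields asked for in the question), one could loop over the divs rather than the links. This is only a rough sketch: the Sub name GetNamesAndLinks is made up for illustration, and it assumes every item is a <div class="mahead"> holding one <b> and one <a>, as in the posted source, and that the same local test file is used.

```vba
Private Sub GetNamesAndLinks()
    ' Hypothetical Sub name; "mahead" and the file path come from the posts above.
    Dim IE As Object
    Dim doc As Object
    Dim d As Object
    Dim boldTags As Object
    Dim anchors As Object
    Dim rng As Range

    Set rng = Range("A1")
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Navigate "C:\GetLinksFromDivsTest.html"

    ' Wait for the page to finish loading (4 = READYSTATE_COMPLETE)
    Do While IE.Busy Or IE.ReadyState <> 4: DoEvents: Loop

    Set doc = IE.Document

    For Each d In doc.getElementsByTagName("div")
        ' Assumption: item divs carry exactly this class name
        If d.className = "mahead" Then
            Set boldTags = d.getElementsByTagName("b")
            Set anchors = d.getElementsByTagName("a")
            ' Skip any div that lacks either the bold name or the link
            If boldTags.Length > 0 And anchors.Length > 0 Then
                rng.Value = boldTags.Item(0).innerText      ' Field 1: item name
                rng.Offset(0, 1).Value = anchors.Item(0).href ' Field 2: URL
                Set rng = rng.Offset(1, 0)
            End If
        End If
    Next d

    IE.Quit
    Set IE = Nothing
End Sub
```

If the real page uses the "mahead" class on other divs as well, the Length check should skip those that have no link, but the class test may need tightening.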
 