Excel Web Query does not return hyperlink

M

MikeA

With Office Professional Excel 2003, when I use a web query, some sites
return hyperlinks while others do not. I have set the option "Full HTML
formatting" on.

Examples: for http://www.yahoo.com, the hyperlinks show up in the worksheet,
but for
http://www.craigslist.com, they do not show up as hyperlinks.

I am not familiar enough with HTML to understand the differences in the web
sites.

Is there any workaround for this?

thanks, Mike
 
J

Joel

It all depends on the tools and how the webpage was created. You can always
write a macro like the one below to get the links

Sub Getchemicals2()

Found = False
For Each sht In Sheets
If sht.Name = "Chemicals" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count))
ChemicalSht.Name = "Chemicals"
Else
Set ChemicalSht = Sheets("Chemicals")
ChemicalSht.Cells.ClearContents
End If


Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True

URLFolder = _
"http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/"
ChemicalRowCount = 1
For Letters = 0 To 25
AlphaLetter = Chr(Asc("a") + Letters)

URL = URLFolder & AlphaLetter & "_index.htm"

'get web page
ie.Navigate2 URL
Do While ie.readyState <> 4
DoEvents
Loop

Do While ie.busy = True
DoEvents
Loop

H2Found = False
For Each itm In ie.Document.all
If H2Found = False Then
If itm.tagname = "H2" Then
H2Found = True
End If
Else

If itm.tagname = "A" Then
If itm.innertext = "" Then Exit For

'chemical name
ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext
'webpage
ChemicalSht.Range("B" & ChemicalRowCount) = itm.href

ChemicalRowCount = ChemicalRowCount + 1
End If
End If
Next itm

Next Letters


End Sub
 
M

MikeA

Joel said:
It all depends on the tools and how the webpage was created. You can always
write a macro like the one below to get the links

Sub Getchemicals2()

Found = False
For Each sht In Sheets
If sht.Name = "Chemicals" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count))
ChemicalSht.Name = "Chemicals"
Else
Set ChemicalSht = Sheets("Chemicals")
ChemicalSht.Cells.ClearContents
End If


Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True

URLFolder = _
"http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/"
ChemicalRowCount = 1
For Letters = 0 To 25
AlphaLetter = Chr(Asc("a") + Letters)

URL = URLFolder & AlphaLetter & "_index.htm"

'get web page
ie.Navigate2 URL
Do While ie.readyState <> 4
DoEvents
Loop

Do While ie.busy = True
DoEvents
Loop

H2Found = False
For Each itm In ie.Document.all
If H2Found = False Then
If itm.tagname = "H2" Then
H2Found = True
End If
Else

If itm.tagname = "A" Then
If itm.innertext = "" Then Exit For

'chemical name
ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext
'webpage
ChemicalSht.Range("B" & ChemicalRowCount) = itm.href

ChemicalRowCount = ChemicalRowCount + 1
End If
End If
Next itm

Next Letters


End Sub

Thanks for the comprehensive example. It really helps to get a complete
working example vice a few of the key lines. Anyway, I did get something
working, but is seems like a difficult solution to maintain/reapply. A few
questions:

Is there an easy way to find the tags used on the web page; I examined the
source, but due to poor formatting (no logical text wrap) it is easy to get
lost/confused. Is there a way to search for (for example) the second table.

Do you know what is missing from the page I am using that prevents the web
query from getting the links?

thanks again, Mike
 
J

Joel

This website has been down since the eveniong of the 23rd. it finally came
back up this morning.

I first look at the webpage using the Internet Explorer at use the menu
option VIEW - SOURCE which shows the source code in Notepad. The TAG items
start with a forward slash and ends with a 2nd forward slash ((may include
tag name

/abc ................ /

or

/abc ................ abc/

Tags can be nested and the slashes are always in pairs (an opening and
closing tag)


To get all the TAG info I usally use the code for SHEET1 below to get the
information.

You can get a trag name like in my example below for SHEET2

set MyTags = IE.document.getelementsbytagname("*")

or

set MyTags = IE.document.getelementsbytagname("abc")

You can also get classes which are shown in the source code as

name="top"></a><div id="msviMasthead"><table cellpadding="0"

The class name starts with id=

Which in this case is msviMasthead

Use this code to get the class

Set Class = IE.document.getElementById("msviMasthead")

See my code for SHEET3 below.



Sub test()

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://www.nissanusa.com/apps/dealerlocator"
Request = "?zipCode=07508&tool=Home.Locator"

'get web page
IE.Navigate2 URL & Request
Do While IE.readyState <> 4
DoEvents
Loop

With Sheets("Sheet1")
RowCount1 = 1
For Each itm In IE.document.all
.Range("A" & RowCount1) = itm.tagname
.Range("B" & RowCount1) = itm.ClassName
.Range("C" & RowCount1) = Left(itm.innertext, 1024)

RowCount1 = RowCount1 + 1
Next itm
End With

With Sheets("Sheet2")
RowCount2 = 1
For Each itm In IE.document.getelementsbytagname("*")
.Range("A" & RowCount2) = itm.tagname
.Range("B" & RowCount2) = itm.ClassName
.Range("C" & RowCount2) = Left(itm.innertext, 1024)

RowCount2 = RowCount2 + 1
Next itm
End With

With Sheets("Sheet3")
RowCount3 = 1
For RowCount = 1 To RowCount1 - 1
ClassName = Sheets("Sheet1").Range("B" & RowCount)
If ClassName <> "" Then
Set Class = IE.document.getElementById(ClassName)
If Not Class Is Nothing Then
.Range("A" & RowCount3) = Class.tagname
.Range("B" & RowCount3) = Class.ClassName
.Range("C" & RowCount3) = Left(Class.innertext, 1024)
RowCount3 = RowCount3 + 1
End If
End If
Next RowCount
End With

End Sub
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top