Code to get data from webpage into Excel

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I am trying to programmatically download, copy, import or any other way
cajole the contents of a specific web page onto an Excel Worksheet.

I have previously had some discussion with Dysgraphia so if he's still out
there apologies for this more specific re-posting.

I have tried:-
Dim webBk As Workbook
Set webBk=Workbooks.Open("http://www.somewebpage.com)

But I get a runtime error message saying MS Access cannot access the file
"http://www.somewebpage.com" because either it doesn't exist or is being used
by another program – neither of which is true. However the website does
require a UserName & Password and I think this is where this method falls
down.

So then I have tried:-
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.Navigate "https://www.somewebsite.com"
End With

This brings up the site, and I can then manually enter the security details
and get to the required page. But I don't then know the code needed to
transfer it to Excel.

Also is there any programatical way round the problem of having to log in to
the site. I believe Excel provides the means of using a Personal Digital
Signature, but haven't a clue where to get one. This application is all
private.
A
ny suggestions would be most welcome.
 
Add a reference to "Microsoft HTML Object Library" to your VBA project then
try the following code (change yourURL to the desired web page):

Sub mshtmltest()
Dim objMSHTML As New MSHTML.HTMLDocument
Dim objDocument As MSHTML.HTMLDocument
Dim E As MSHTML.HTMLGenericElement

Set objDocument = objMSHTML.createDocumentFromUrl(yourURL, vbNullString)

While objDocument.readyState <> "complete"
DoEvents
Wend

For Each E In objDocument.all
If (E.tagName = "TD") Then
Debug.Print E.innerHTML
End If
Next
End Sub

As you can see, you can filter down to the HTML element level. In the above
example, I have filtered only "TD" elements. It would be even easier if
your table had a name or id. Then you can use:

objDocument.getElementById

or

objDocument.getElementByName

methods to directly access the HTML element that contains your data.
 
Thanks for the input Mark.
I'm ok down to the end of the Do Events loop.
Then, instead of extracting specific data from the resulting objDocument
(because it is too difficult - for me - to define specifics) , what I would
like to do is programatically put the whole objDocument onto a Worksheet as
if I had gone to the web page, selected all, copied all then pasted it.
 
You can past the HTML code onto your sheet, but Excel is not the best
program to render HTML onto a spreadsheet (unless you insert a browser
object onto your sheet). You can paste the text of the BODY element into a
cell on your sheet.

I think it might be helpful if you could describe the layout of the page you
are trying to paste onto a worksheet.

If it is simple text, you can use the following (after the Wend statement):

ActiveSheet.Cells(1, "A").Value = objDocument.body.innerText
 
Thanks Mark,
The webpage I wish to put onto a WorkSheet is itself formatted as a table.
So it renders well into Excel when I manually copy and paste it and sort of
"self parses". I can then clean it up easily with simple code to get just
what I need.
What I would like to do is programmatically automate the equivalent of a
copy and paste on the WHOLE page.

Your suggestion:-
ActiveSheet.Cells(1, "A").Value = objDocument.body.innerText

would be ideal if it could select the WHOLE objDocument, not just parts ie
"InnerText".
I'm not sure if this is possible: if not, is there some way to say
With objDocument
.Select All
.Copy
End With
Thanks
 
Hi,

My company uses a cgi-bin within its network to submit information wit
various input and then posts resulting data in IE under the same URL.
know the name of the table that I want to import after the 'results
page has loaded and I can progmattically submit the information neede
on the 'submit' screen, but the URL never changes. So when I try t
create a new query to import my data, it will only recognize the table
on the submit screen! Is there anyway to get around this? I've trie
waiting until the 'results' page has loaded and excel still does no
recognize the tableID that is right there in the source code. Pleas
help! I'm goin insane.

What I have so far is:

Sub webdata()
Dim ie As Object
Dim test As Range
Dim iedoc As HTMLDocument


On Error GoTo 1
Set ie = CreateObject("InternetExplorer.Application")
With ie
.navigate "http://natoth/cgi-bin/eac_project_summary.pl"
.Visible = True
Do While .Busy: DoEvents: Loop ' Loop until page is loaded
Do While .readyState <> 4: DoEvents: Loop
With .document.forms("F001") ' form name goes in ()
.project_id_pulldown.Value = proj 'all form values listed below
.xl_or_html.Value = "XL"
.dept_id.Value = "L" & dept
.output_format.Value = "EST"
.submit 'submit form values

End With

End With

<Here I need to import table("projectstatus") to Sheet1.rangeA1>


Set ie = Nothing
Exit Sub
1: MsgBox "Unexpected Error, sorry."

Set ie = Nothing



Best Regards,
Mjac
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top