Opening web documents as "text"

fisherofsouls

Hi,

I have a simple request which is driving me nuts! I wonder if anyone
can help?

I want to open HTML files on remote Web servers and read the contents
as text. That's it! If I use the Workbooks.Open method, then Excel
loads the page into a worksheet - this is what I want to avoid. I would
rather write some code to parse the HTML text which makes up the content
of the document. The Open statement just doesn't work on anything other
than local files.

Essentially, I want to do exactly the same as the code snippet below,
but where sSource is:

"http://someserver.somewhere.com/example.htm"

rather than

"c:\example.htm"

Any help or advice would be greatly appreciated!

Regards

Nick

::CODE SNIPPET STARTS HERE
Sub Example_Code()

    Const sSource$ = "c:\example.htm"

    Dim hFile As Integer

    hFile = FreeFile

    Open sSource For Input Access Read Shared As #hFile

    'Perform various operations on data held within file
    Close #hFile

End Sub
::CODE SNIPPET ENDS HERE
 
Tom Ogilvy

Jake Marx posted this:

=======================================================
Hi Witek,

Here's an example that should get you started. It doesn't work real well
with active content (ASP, etc.), but it should work for most sites. If
you want the HTML source instead of the viewable text produced by the HTML,
you can change the InnerText property to InnerHTML.


Regards,
Jake Marx


Sub Demo()
    Dim ie As Object
    Dim nFile As Integer
    Set ie = CreateObject("InternetExplorer.Application")

    With ie
        .Visible = False
        .Silent = True
        .Navigate "www.yahoo.com"
        ' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
        Do While .Busy Or .ReadyState <> 4
            DoEvents
        Loop
        nFile = FreeFile
        ' Write the visible text of the page to a local file
        Open "D:\yahoo.txt" For Output Shared As #nFile
        Print #nFile, .Document.DocumentElement.InnerText
        Close #nFile
        .Quit
    End With
    Set ie = Nothing
End Sub
===========================================================




Alyda Gilmore added this additional advice:
===========================================
Witek,


Further to Jake's most excellent example, may I suggest that you set a
reference to Microsoft Internet Controls (Shdocvw.dll) and use 'Dim ie As
SHDocVw.InternetExplorer' to declare the ie object. The beauty of this
approach is that you have all the IntelliSense features of the VBA Editor at
your disposal, including auto list members, syntax checking, parameter info,
quick info, and code formatting.
===========================================
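
For what it's worth, here is a rough sketch of what Alyda's early-bound
declaration might look like in practice. It assumes the reference to
Microsoft Internet Controls has been ticked under Tools > References in the
VBA editor; that checkbox is all that is needed, and no "Declare Function"
statement is involved (Declare is for calling functions exported from DLLs).
The procedure name DemoEarlyBound and the 200-character preview are only
illustrative, not part of the original posts.

```vba
' Assumes: Tools > References > "Microsoft Internet Controls" is ticked.
' DemoEarlyBound is a hypothetical name used for illustration only.
Sub DemoEarlyBound()
    Dim ie As SHDocVw.InternetExplorer
    Set ie = New SHDocVw.InternetExplorer

    With ie
        .Visible = False
        .Silent = True
        .Navigate "www.yahoo.com"
        ' READYSTATE_COMPLETE comes from the SHDocVw type library
        Do While .Busy Or .ReadyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
        ' .Document is returned as Object, so this part is still late bound
        Debug.Print Left$(.Document.DocumentElement.innerText, 200)
        .Quit
    End With
    Set ie = Nothing
End Sub
```

With the reference set, typing "ie." in the editor should bring up the
member list, which is the main benefit Alyda describes.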


So this writes the text of the page to a file, but you could instead read
the text directly from

.Document.DocumentElement.InnerText

or, for the HTML source,

.Document.DocumentElement.InnerHTML

and parse it.
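
As a minimal sketch of that idea, the routine below pulls the page source
into a String rather than a file and then picks out one piece of it. The
names GetPageHtml, TextBetween and ShowTitle are made up for this example,
the URL is the placeholder from the original question, and the title
extraction is just one simple way to parse, not a general HTML parser.

```vba
' Hypothetical helper: returns the HTML source of a page as a String,
' using the same InternetExplorer automation as Jake's example.
Function GetPageHtml(ByVal sUrl As String) As String
    Const READYSTATE_COMPLETE As Long = 4
    Dim ie As Object
    Set ie = CreateObject("InternetExplorer.Application")

    With ie
        .Visible = False
        .Silent = True
        .Navigate sUrl
        Do While .Busy Or .ReadyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
        GetPageHtml = .Document.DocumentElement.innerHTML
        .Quit
    End With
    Set ie = Nothing
End Function

' Hypothetical helper: returns the text between the first sStart and the
' next sEnd (case-insensitive), or "" if either marker is not found.
Function TextBetween(ByVal sText As String, ByVal sStart As String, _
                     ByVal sEnd As String) As String
    Dim lFrom As Long, lTo As Long
    lFrom = InStr(1, sText, sStart, vbTextCompare)
    If lFrom = 0 Then Exit Function
    lFrom = lFrom + Len(sStart)
    lTo = InStr(lFrom, sText, sEnd, vbTextCompare)
    If lTo = 0 Then Exit Function
    TextBetween = Mid$(sText, lFrom, lTo - lFrom)
End Function

Sub ShowTitle()
    Dim sHtml As String
    ' Placeholder URL taken from the original question
    sHtml = GetPageHtml("http://someserver.somewhere.com/example.htm")
    Debug.Print TextBetween(sHtml, "<title>", "</title>")
End Sub
```

Once the source is in a String, the usual VBA string functions (InStr,
Mid$, Split and so on) can be used on it just as with text read from a
local file.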
 
fisherofsouls

Tom,

Thanks for this.

If I'm going to use SHDocVw, I will presumably need to write a "Declare
Function" statement ?

Would you happen to have an example ?

Regards

Nick
 
