Interpret HTML <Script> in Visual Basic

Jarod_24

My program (a web crawler) downloads HTML pages, and in some cases the
HTML contains <script> </script> tags that generate parts of the page.

How can I interpret this JavaScript and get the resulting HTML?

I'm hoping for something very simple:

Dim page As String = downloadWebPage("http://www.page.com/index.htm") ' raw HTML
Dim newPage As String = interpreter.processScripts(page)

' Then parse the newPage string to get the links.

Does anyone have any links to JavaScript interpreters for use in .NET?


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
 
jan.hancic

Google for SpiderMonkey. Maybe you can use that (you will have to make
unmanaged calls, though).
Or maybe JScript is what you are looking for.

Or the easiest way would be to use the WebBrowser control: load the
page into the control and then get the generated HTML from there.
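
A minimal sketch of the WebBrowser route, assuming a console project that references System.Windows.Forms. The module name RenderedHtmlSketch and the URL are placeholders taken from the question, not a tested crawler. It only captures what scripts have written by the time the page finishes loading; anything added later by timers would be missed.

```vbnet
Imports System
Imports System.Windows.Forms

Module RenderedHtmlSketch   ' hypothetical name, for illustration only

    Private WithEvents browser As WebBrowser

    <STAThread()> _
    Sub Main()
        browser = New WebBrowser()
        browser.ScriptErrorsSuppressed = True   ' hide script error dialogs
        browser.Navigate("http://www.page.com/index.htm")   ' placeholder URL
        Application.Run()   ' message loop; the control needs it to load the page
    End Sub

    Private Sub browser_DocumentCompleted(ByVal sender As Object, _
            ByVal e As WebBrowserDocumentCompletedEventArgs) _
            Handles browser.DocumentCompleted
        ' The event also fires for each frame; wait for the top-level document.
        If Not e.Url.Equals(browser.Url) Then Return

        If browser.Document.Body IsNot Nothing Then
            ' HTML as it stands after the inline scripts have run
            Dim newPage As String = browser.Document.Body.OuterHtml
            Console.WriteLine("Rendered body length: " & newPage.Length)
        End If

        ' The parsed document already exposes the links, so no string parsing is needed.
        For Each link As HtmlElement In browser.Document.Links
            Console.WriteLine(link.GetAttribute("href"))
        Next

        Application.ExitThread()   ' end the message loop started in Main
    End Sub

End Module
```

Keep in mind that the control runs every script on the page with the same engine as Internet Explorer, so only point it at pages you are willing to execute.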


Regards,
Jan Hancic
http://cwizo.blogspot.com
 
Renze de Waal

On Tue, 4 Apr 2006 21:03:08 +0200, Jarod_24 wrote:
My program (a web crawler) downloads HTML pages, and in some cases the
HTML contains <script> </script> tags that generate parts of the page.

How can I interpret this JavaScript and get the resulting HTML?

I'm hoping for something very simple:

Dim page As String = downloadWebPage("http://www.page.com/index.htm") ' raw HTML
Dim newPage As String = interpreter.processScripts(page)

' Then parse the newPage string to get the links.

Does anyone have any links to JavaScript interpreters for use in .NET?

You intend to write a program that reads webpages and executes all
(java)scripts on it? Do you know the possibilities of JavaScript well
enough to be able to realise the possible consequences of this? It seems
like a dangerous thing to do. I would advise you to reconsider...

Renze de Waal.
 
Jarod_24

Renze de Waal said:
You intend to write a program that reads webpages and executes all
(java)scripts on it? Do you know the possibilities of JavaScript well
enough to be able to realise the possible consequences of this? It seems
like a dangerous thing to do. I would advise you to reconsider...

Renze de Waal.

I'm aware of the consequences. It's one of the reasons why I'd prefer not to
use the WebBrowser control that is included in Visual Studio 2005.
The control basically gives you an IE window in your app, with no ability to
control pop-ups, ActiveX controls, and so on.

A JavaScript interpreter (not Java) is what I'm looking for. If that
interpreter had some ability to turn specific JavaScript features on or off,
even better.

As far as I understand, you can't give the Windows Script Host (WSH) raw
HTML and expect to get the completed result out. It needs a well-written .js
script, which is nothing like the mishmash of <html> and <script> tags and
everything else that a web page has.

