That's the thing: the JavaScript has to store the URL somewhere, and
that is usually an element in the HTML (even if it is dynamic). Even if it
is stored in some array in memory and then accessed in a click handler or
something like that, you will have to execute the JavaScript in the page and
then query the script engine in order to get those values, which MSHTML
will allow you to do.
--
- Nicholas Paldino [.NET/C# MVP]
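To illustrate the point above, here is a minimal sketch of the kind of page script being described. Everything in it (the `sections` array, the `state` object, the `onMenuClick` handler and the URL pattern) is hypothetical and not taken from any real site. It only shows that the URL does not exist anywhere until the handler has actually been executed.

```javascript
// Hypothetical page script: nothing here comes from a real site.
// The URL fragments live in an in-memory array, not in the HTML.
const sections = ["taps", "showers", "accessories"];

// State the page updates when the user clicks a menu entry.
const state = { currentUrl: null };

// Click handler that assembles the URL at run time.
function onMenuClick(index) {
  state.currentUrl = "/catalog/" + sections[index] + "?view=list";
  return state.currentUrl;
}

// A bot reading the static source sees only the pieces above.
// To get the real URL it has to run the handler, as a script
// host would, and then read the value the script produced.
onMenuClick(1);
console.log(state.currentUrl); // prints "/catalog/showers?view=list"
```

A bot that only downloads the source never sees the string `/catalog/showers?view=list`; it appears only after the handler has run, which is why hosting the page in an engine that executes the script is suggested.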
Logician,
Right, and using MSHTML, you would load the page, and then your bot
would manipulate the DOM to set the appropriate input. Once you do that,
the HTML engine will interpret the JavaScript, and then you can access the
DOM to get whatever elements were changed to have the new URL.
--
- Nicholas Paldino [.NET/C# MVP]
On Sep 18, 4:00 pm, "Nicholas Paldino [.NET/C# MVP]"
Logician,
If this is in the context of an HTML page, then why not use an HTML
document host like MSHTML to execute the JavaScript? You can then access
the document object model (DOM) after the JavaScript is executed, as well
as set values on the page and see how the page reacts.
I'm not sure how that works. It is an HTML page, but it is generated from
user input, e.g. a menu click. The bot has to read the JS code and then
interpret it to get the new URL, then effectively visit the URL. But
as the URL is built at run time, this is fairly hard.
I checked out some sites and saw, e.g. on grohe.co.uk, that not even Google
has indexed much of those sites.
--
- Nicholas Paldino [.NET/C# MVP]
Does anyone have any idea how this is done?
I am writing a C# bot to grab data from sites, but some sites have
extensive JavaScript navigation. This means I have to read the script
and effectively run it within C#.
I have one example from a book (Http programming for bots using c#).
The problem I have is understanding how to set up the package and then
how to process the JavaScript code on the site without, in effect,
copying the code.
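JScript.Eval E = new JScript.Eval();
String expression = TextBox.Text;
try
{
    // Evaluate the expression through the JScript helper class below.
    TextBox1.Text = E.DoEval(expression);
}
catch (Exception ex)
{
    // Assumed handling (not in the original post): show the error text.
    TextBox1.Text = ex.Message;
}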
JScript is defined as:
package JScript
{
    class Eval
    {
        public function DoEval(expr:String):String
        {
            return eval(expr);
        }
    }
}
Many of the sites do not have URLs as such; they are effectively
opening JS windows on the client and then using data access to build up
data. So the bot needs to simulate the user input and then read the
new screen. This is different from finding a new URL, which assumes the
data is stored as HTML (or other markup) somewhere statically and
needs to be accessed. The data is stored in a database only, and
everything is generated.