Webbrowser control help....

jim

Has anyone seen a way to read the data stream for a URL being navigated to
by a webbrowser control and edit it (i.e. search HTML stream for and
possibly remove objectionable language for a kid's browser) before it's
displayed in the browser?

I'd also like to know if there is a way to drag and select portions of a
page in the webbrowser control (like cells, tables, divs, images - any
object on the page really) by clicking on the page displayed in the
webbrowser control and dragging the mouse - like you can do when selecting
multiple items on your desktop. The goal is to highlight the items that are
being selected and add their objects to an array for copying to my local
drive.

I'm new to the WebBrowser control in .NET and I really don't see the
functionality that we used to have with the old shwdoc.dll stuff in VB6.
So, any help with code that extends the .NET WebBrowser control beyond
what is built into it would be greatly appreciated.

jim
 
Peter Duniho

(I don't think it's a good idea to cross-post to two completely different
language groups...VB has been removed in this post to the C# newsgroup. I
grudgingly left "general" :) )

> Has anyone seen a way to read the data stream for a URL being navigated to
> by a webbrowser control and edit it (i.e. search HTML stream for and
> possibly remove objectionable language for a kid's browser) before it's
> displayed in the browser?

You can use the WebBrowser.DocumentStream or WebBrowser.DocumentText
properties to explicitly set the HTML for the browser window. You can use
HttpWebRequest to get a stream from a URL that you can attach to the
WebBrowser. In your case, you'd want to insert your own stream in
between, one that filters the data as it reads it and passes it along to
the WebBrowser. Or you could just download the entire page all at once
yourself, filter the text, and then set the text for the WebBrowser.
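
As a rough illustration of the "download the page, filter the text, then set it" approach, here is a minimal C# sketch of the filtering step only. HtmlWordFilter and Mask are hypothetical names, not part of the WebBrowser API, and the regex-based split into tags and text is a simplifying assumption (it does not treat script or style blocks specially).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the WebBrowser API.
public static class HtmlWordFilter
{
    // Replaces each blocked word in the text between tags with asterisks
    // of the same length. Tag names and attributes are left untouched.
    public static string Mask(string html, IEnumerable<string> blockedWords)
    {
        string[] words = blockedWords
            .Where(w => !string.IsNullOrEmpty(w))
            .Select(Regex.Escape)
            .ToArray();
        if (words.Length == 0)
            return html;

        // Whole-word, case-insensitive match on any blocked word.
        var wordPattern = new Regex(
            @"\b(?:" + string.Join("|", words) + @")\b",
            RegexOptions.IgnoreCase);

        // Walk the markup as alternating tags (<...>) and text runs,
        // masking words only inside the text runs.
        return Regex.Replace(html, @"<[^>]*>|[^<]+", m =>
            m.Value[0] == '<'
                ? m.Value
                : wordPattern.Replace(m.Value, w => new string('*', w.Length)));
    }
}
```

In a form, the filtered result could then be assigned with something like webBrowser1.DocumentText = HtmlWordFilter.Mask(html, blocked), where html is the page text you downloaded yourself and webBrowser1 and blocked are placeholders for your own control and word list.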

I haven't looked closely at how you'd need to intercept clicks on links in
order to allow regular navigation to happen smoothly (when you set those
properties, the page address is actually "about:blank", which I'd guess
would mess up any relative links). But I suspect it's doable somehow.
It probably involves intercepting the links and redirecting them, and/or
replacing links in the document with absolute links (which I guess you'd
still need to intercept, so that newly clicked links also get a filtered
page).
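
For the "replace links with absolute links" idea, a minimal C# sketch might look like the following. LinkRewriter and MakeAbsolute are hypothetical names, and matching href/src attributes with a regex is an assumption made for brevity; a real HTML parser would be more robust. It resolves each URL against the address the page was originally downloaded from.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the WebBrowser API.
public static class LinkRewriter
{
    // Matches quoted href="..." and src="..." attribute values.
    private static readonly Regex UrlAttribute = new Regex(
        @"(?<prefix>\b(?:href|src)\s*=\s*)(?<quote>[""'])(?<url>.*?)\k<quote>",
        RegexOptions.IgnoreCase);

    // Rewrites relative URLs so they still resolve after the HTML is
    // loaded into a document whose own address is about:blank.
    public static string MakeAbsolute(string html, Uri pageUri)
    {
        return UrlAttribute.Replace(html, m =>
        {
            Uri resolved;
            // Leave the attribute alone if the URL cannot be resolved.
            if (!Uri.TryCreate(pageUri, m.Groups["url"].Value, out resolved))
                return m.Value;

            string quote = m.Groups["quote"].Value;
            return m.Groups["prefix"].Value + quote + resolved.AbsoluteUri + quote;
        });
    }
}
```

Clicks on the rewritten links could then be caught in a handler for the WebBrowser.Navigating event, which exposes the target in e.Url and can be stopped with e.Cancel = true, so that the target page is downloaded and filtered the same way. Note that setting DocumentText itself triggers a navigation to about:blank, so such a handler would need to let that one through.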

Now, all that said...I think text is the least of your worries, unless you
are going to disable images, audio, and video altogether. It seems to me
that, rather than trying to manipulate the HTML yourself, it might make
more sense to rely on IE's built-in content filtering. Either require
the user (parent) to ensure that it's properly configured, or do some sort
of automatic configuration on installation. You would probably be better
off focusing on the "skinning" aspect of the project, rather than getting
mired in the content filtering stuff. That's a real rat hole, and
IE's already got a lot of stuff in there to try to deal with that
(granted, it doesn't have a language filter AFAIK, but again...the least
of your worries :) ).

Pete
 
Nicholas Paldino [.NET/C# MVP]

In addition to what Peter said, it would seem that you have code already
that does a great deal of what you want (you made the comment "I really
don't see the functionality that we used to have with the old shwdoc.dll
stuff in VB6"). The WebBrowser control is just a managed wrapper around the
control exposed from shwdoc.dll. You can get the ActiveX instance through
the ActiveXInstance property and do all the things with that that you did in
VB6 (and more, really, since there were a number of interfaces that weren't
able to be used in VB6).

Also, if you want to filter the content that comes into the web browser,
you really want to create a pluggable MIME filter. It will require a good
deal of interop, but you can register this for your app and then the content
going through the WebBrowser control will be able to be modified as it goes
to it.

There are a good many suggestions indicating that you should use
HttpWebRequest to load the content yourself and then use the DocumentStream
property of the WebBrowser to populate it. The thing is, it shoots
relative URLs to hell (since the control doesn't know what the base URL
is) and will really only help if you have control of the content on
the server side (as you can avoid creating content that will fail when not
referencing a base URL).

Creating a MIME filter won't cause that problem. Of course, the MIME
type you would filter would probably be anything that is "text/*".
 
Cor Ligthert[MVP]

Hi Nicholas,

I read this message earlier, but I have never heard of a shwdoc.dll in
VB6. (I know shdocvw.dll, but the names are so different that I assumed it
could not be that.) I was not that curious about it, though, especially
since that one can still be used in every version of .NET (AxWebBrowser).

However, since you apparently know what it is, can you give me a link to it?

Cor
 
