OK, where'd you hide my freekin' Regex html -> text extractor?


_BNC

I've been looking for a couple of weeks for a regular expression that will
extract text from HTML in a form that looks like what IE shows on screen.
I'm sure one of you guys hid it somewhere as a joke, but it's not funny
any more. Give it up, OK?

Seriously, I've checked regexlib.com and regular-expressions.info and lots
of google searches. I find tag extractors and other related stuff but
haven't come up with the 'visible text' extractor.

The other way I could do this is to bring up a website (à la scraper) and
somehow do the programmatic equivalent of 'mouse drag capturing' the
visible text into the clipboard. Not sure if that's easy to do though.
 

Chris Hyde

_BNC said:
The other way I could do this is to bring up a website (à la scraper) and
somehow do the programmatic equivalent of 'mouse drag capturing' the
visible text into the clipboard. Not sure if that's easy to do though.

There is actually an easier way...use a tool like SgmlReader to convert
the HTML into XHTML, and you can then "query" for the data you want
using XPath expressions...

SgmlReader: http://www.gotdotnet.com/Community/...mpleGuid=B90FDDCE-E60D-43F8-A5C4-C3BD760564BC
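To give a rough idea of what that looks like once the HTML is well-formed, here is a minimal sketch in Python rather than .NET, using only the standard library. It assumes the conversion step (the job SgmlReader would do) has already happened, so the sample string, the SKIP set, and the visible_text helper are all made up for illustration. It also assumes the markup carries no XML namespace; real XHTML usually does, and the tag names would need adjusting for that.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: already well-formed markup, standing in for the
# output of an HTML-to-XHTML converter. No xmlns is declared here.
XHTML = """<html>
  <head><title>Demo</title><style>p { color: red; }</style></head>
  <body>
    <h1>Hello</h1>
    <p>Visible <b>text</b> here.</p>
    <script>var hidden = 1;</script>
  </body>
</html>"""

# Elements whose contents a browser does not render as page text
# (an illustrative list, not an exhaustive one).
SKIP = {"head", "script", "style"}


def visible_text(elem):
    """Yield text chunks in document order, skipping non-rendered elements."""
    if elem.tag in SKIP:
        return
    if elem.text:
        yield elem.text
    for child in elem:
        yield from visible_text(child)
        # Text that follows a child's closing tag belongs to the parent,
        # so it is emitted even when the child itself was skipped.
        if child.tail:
            yield child.tail


root = ET.fromstring(XHTML)

# Collapse runs of whitespace the way a browser roughly would.
text = " ".join("".join(visible_text(root)).split())

# The "query" side: ElementTree supports a limited XPath subset.
headings = [h.text for h in root.findall(".//h1")]

print(text)
print(headings)
```

This only approximates screen output: it knows nothing about CSS, hidden elements, or block-level line breaks, so treat it as a starting point rather than a match for what IE displays.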

HTH...

Chris
 
