getting data from string

  • Thread starter: Guest

Guest

Hi all,

I have a string containing HTML code and want to extract the values of the
action and/or href attributes. For example, given href="www.google.com", I
want to detect the href attribute and get the value between the quotes
(www.google.com).

What is the best way to do this?

Regards
Stijn
 
Hi Stijn,

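I would say the best way to extract substrings from a string is to use
regular expressions.

Since in this case you need the value between the quotes, you could match
the quoted value with a pattern such as "([^"]*)". (The :q shorthand does
this in the Visual Studio Find and Replace dialog, but as far as I know it
is not part of the .NET Regex class.)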
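
To make the suggestion concrete, here is a minimal sketch of the quoted-value idea. It is written in Python purely for illustration (the thread is about .NET, where the Regex class would play the same role); the names ATTR_PATTERN and extract_quoted_attributes are hypothetical, and the pattern assumes the attribute values are wrapped in double quotes.

```python
import re

# Matches href="..." or action="..." and captures the text between the
# double quotes. Assumption: values are always double-quoted.
ATTR_PATTERN = re.compile(r'\b(href|action)\s*=\s*"([^"]*)"', re.IGNORECASE)


def extract_quoted_attributes(html):
    """Return (attribute name, value) pairs for href/action attributes."""
    return [(m.group(1).lower(), m.group(2)) for m in ATTR_PATTERN.finditer(html)]


sample = '<a href="www.google.com">Google</a><form action="/search" method="get">'
print(extract_quoted_attributes(sample))
```

The second capture group holds the value without its quotes, which is the part the original question asks for.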

HTH,
Rakesh Rajan
 
Here is a regex that I wrote some time ago to find image references. You
should be able to adapt it to find anchors. It is somewhat complex, and it
has been a while since I used it, so it probably has some bugs, but it
should get you started.

<img.+?src\s*=\s*((["](?<xref>[^"]+)["][^>]*[>])|(['](?<xref>[^']+)['][^>]*[>])|((?<xref>[^\s>]+)[^>]*[>]))
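
One possible adaptation for anchors is sketched below, again in Python as an illustration rather than the .NET code the poster had in mind. It keeps the same three cases (double-quoted, single-quoted, unquoted). Python's re module does not allow the same group name to be reused, so the hypothetical names dq, sq and bare stand in for the repeated xref group; ANCHOR_HREF and extract_anchor_hrefs are likewise made up for this example.

```python
import re

# Verbose pattern: whitespace and "#" comments inside it are ignored.
ANCHOR_HREF = re.compile(
    r"""<a\b[^>]*?\shref\s*=\s*      # opening of an <a> tag up to href=
        (?:"(?P<dq>[^"]*)"           # double-quoted value
          |'(?P<sq>[^']*)'           # single-quoted value
          |(?P<bare>[^\s>]+))        # unquoted value, ends at space or >
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_anchor_hrefs(html):
    """Return the href value of every <a> tag found in html."""
    hrefs = []
    for m in ANCHOR_HREF.finditer(html):
        # Only one of the three alternatives takes part in a given match.
        value = next(g for g in m.group("dq", "sq", "bare") if g is not None)
        hrefs.append(value)
    return hrefs


sample = (
    '<a href="www.google.com">a</a> '
    "<A HREF='page.html' class=\"x\">b</A> "
    "<a href=index.htm>c</a>"
)
print(extract_anchor_hrefs(sample))
```

Like any regex over HTML, this is a best-effort match: it does not handle comments, scripts, or malformed quoting.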
 
Stijn,

For this, .NET has the MSHTML class, which covers the DOM.

However, it is very hard to use (it has an endless number of interfaces),
so most people probably use a regex instead.
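
For comparison, here is a rough sketch of the parser-based alternative. It does not use MSHTML; it uses Python's standard html.parser module as a stand-in to show the general idea of letting a parser find the attributes instead of a regex. LinkCollector and collect_links are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href and action attribute values while parsing."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # The parser hands over lower-cased names and unquoted values.
        for name, value in attrs:
            if name in ("href", "action") and value is not None:
                self.values.append((tag, name, value))


def collect_links(html):
    """Return (tag, attribute, value) triples for href/action attributes."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.values


sample = '<a href="www.google.com">Google</a><form action=\'/search\'></form>'
print(collect_links(sample))
```

The parser deals with quoting styles and letter case itself, which is the main thing the regex versions have to spell out by hand.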

Just my thought,

Cor
 