Need some help with a regular expression

  • Thread starter: User N

User N

I need a regular expression that matches an HTML anchor element, extracting the
href and text, given some of the text. For example, given the
following two-line input:

<a href="a">bogus</a><a href="b">foobar</a>
<a href="c"><b>foo</b>bar</a>

I want to match the anchors that contain "bar" in the text. A simple
attempt might look like:

<a\s+href=\"(.+?)\">(.*?bar)</a>

but that actually matches the entire first line as a single match. I think some kind of
negative lookahead is needed, but I can't quite figure it out. Any ideas?
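
To make the problem concrete, here is a minimal sketch using Python's re module. The thread does not say which regex engine is in use, so Python and the variable names below are assumptions for illustration only. The lazy `.*?` prefers the shortest match, but nothing forbids it from running past the first `</a>`, so the match that starts at the first anchor extends to the end of the line.

```python
import re

# The two-line input from the post.
text = '<a href="a">bogus</a><a href="b">foobar</a>\n<a href="c"><b>foo</b>bar</a>'

# The simple attempt. Inside a raw string the quotes need no backslash.
naive = re.compile(r'<a\s+href="(.+?)">(.*?bar)</a>')

# Collect (whole match, href, text) for every match found.
matches = [(m.group(0), m.group(1), m.group(2)) for m in naive.finditer(text)]
for whole, href, body in matches:
    print(repr(whole), repr(href), repr(body))
```

The first match covers both anchors on line one: group 1 is `a`, while group 2 swallows `bogus</a><a href="b">foobar`.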
 
I didn't try, but maybe modifying it like this:

<a\s+href=\"([^\"]+)\">(.*?bar)</a>
 
Luc E. Mistiaen said:
I didn't try, but maybe modifying it like this:

<a\s+href=\"([^\"]+)\">(.*?bar)</a>

Thanks for the reply. What I came up with was this:

<a\s+href=\"([^\"]+)\">((?:.(?!</a>))*?bar)</a>
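
A quick way to check these patterns against the sample input is sketched below, again assuming Python's re module (the thread does not name an engine). It compares the href-only fix with the lookahead version above. The third pattern, `variant`, is not from the thread: it is a commonly used form that places the lookahead before the dot, added here as an assumption to show one edge case.

```python
import re

text = '<a href="a">bogus</a><a href="b">foobar</a>\n<a href="c"><b>foo</b>bar</a>'

# Restricting only the href: the text group can still run past </a>.
href_only = re.compile(r'<a\s+href="([^"]+)">(.*?bar)</a>')

# Pattern from the reply: a character is consumed only if it is
# not immediately followed by </a>.
tempered = re.compile(r'<a\s+href="([^"]+)">((?:.(?!</a>))*?bar)</a>')

# Variant (an assumption, not from the thread): test the lookahead
# before consuming each character, so the text cannot start with </a>.
variant = re.compile(r'<a\s+href="([^"]+)">((?:(?!</a>).)*?bar)</a>')

href_only_results = href_only.findall(text)
tempered_results = tempered.findall(text)
variant_results = variant.findall(text)

# Edge case: an empty anchor directly before the wanted one.
edge = '<a href="x"></a><a href="y">bar</a>'
tempered_edge = tempered.findall(edge)
variant_edge = variant.findall(edge)

print(href_only_results)
print(tempered_results)
print(variant_results)
print(tempered_edge)
print(variant_edge)
```

On the sample input both lookahead patterns return the `b` and `c` anchors and skip `bogus`, while the href-only pattern still starts its first match at the `a` anchor. On the edge case, the pattern from the reply starts at the empty anchor and runs into the next one, whereas the variant returns only the `y` anchor. None of this makes a regex a substitute for an HTML parser on arbitrary markup; it only covers input shaped like the example.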
 