regular expression replace src attribute in image tag

  • Thread starter: FabSW

FabSW

Hi all, I have to replace the src attribute in an HTML file with a regex.

The image tags look like this:

<td valign="top" align="middle" width="74"><img height="40"
src="1_interr.gif" alt="1_interr.gif" width="40" /></td>

What I want to do is match from <img.... to... /> but capture
only 1_interr.gif.

I tried using lookbehind and lookahead assertions with this pattern:

(?<=<img.*?src=.*?")(?<src>([^"]*?))(?=".*?>)

but it returns 3 matches:

1_interr.gif
width=
40

I need only the first one.

What's wrong?

Thanks.
 
Funny. I would expect it to return 5 matches:

1_interr.gif
alt=
1_interr.gif
width=
40

In the lookbehind you are allowing anything between src= and the
quotation mark, so every later quotation mark in the tag qualifies as
well. Just change src=.*?" to src="
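
For anyone trying this in an engine without variable-length lookbehind (Python's `re` module is one), here is a minimal sketch of the same idea. It swaps the lookbehind for a plain capturing group, so it is not the pattern from the question. The name `find_img_srcs` is made up for illustration, and the pattern assumes no attribute value contains a `>`.

```python
import re

# Capturing group instead of a lookbehind: the quote has to follow `src=`
# directly, so the later attributes (alt, width) cannot be picked up.
IMG_SRC = re.compile(r'<img\b[^>]*?\bsrc="(?P<src>[^"]*)"[^>]*>', re.IGNORECASE)

def find_img_srcs(html):
    """Return the src value of every <img> tag found in html."""
    return [m.group("src") for m in IMG_SRC.finditer(html)]

# The tag from the question, line break included.
sample = '''<td valign="top" align="middle" width="74"><img height="40"
src="1_interr.gif" alt="1_interr.gif" width="40" /></td>'''

print(find_img_srcs(sample))  # ['1_interr.gif']
```

`[^>]` is a negated character class, so unlike `.` it also matches the line break inside the tag without any extra option.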
 
Thanks, it works in almost every file,
but it doesn't when there is whitespace or a form feed inside the tag,

e.g.

<i
mg .....

or

src
"

What's the way to handle that?

Thanks again.
 
You can allow any amount of whitespace between the items:

src\s*=\s*"
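
Putting the two suggestions together, here is a rough sketch of the actual replacement, assuming Python's `re` module and a hypothetical helper `replace_img_src`. It uses capturing groups rather than lookbehind, and it tolerates whitespace (line breaks and form feeds included) around the `=`. It does not handle a tag name that is itself split, as in the `<i` / `mg` example.

```python
import re

# Group 1 keeps everything up to and including the opening quote,
# group 2 is the closing quote. Only the text between them is replaced.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc\s*=\s*")[^"]*(")', re.IGNORECASE)

def replace_img_src(html, new_src):
    """Swap the src value of every <img> tag in html for new_src."""
    # A function replacement avoids backslash surprises inside new_src.
    return IMG_SRC.sub(lambda m: m.group(1) + new_src + m.group(2), html)

# Same tag as in the question, with a form feed and line breaks around '='.
sample = '<td><img height="40"\nsrc\f =\n"1_interr.gif" alt="1_interr.gif" width="40" /></td>'

print(replace_img_src(sample, "new.gif"))
```

The alt attribute is left alone because the match stops at the first closing quote after `src`.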
 