PC Review



Regex Replace Help

 
 
barry
 
      29th Apr 2008
Hi

I have a file which contains
&
&amp;
&quote;

I want to replace & with &amp; , but not &amp; or &quote;

Will someone please help with the Regular Expression.

TIA
Barry



 
 
 
 
 
barry
 
      30th Apr 2008
Strange, no one has replied.

Looks like I have crossed the limit for asking questions on the same topic. Is
there some limit, maybe (10 or 15 per topic)?
 
 
 
 
eBob.com
 
      30th Apr 2008
Hi Barry,

Actually I wanted to play around with this for you but just haven't gotten
around to it. (And, btw, I don't recall any previous posts from you on this
subject.)

Part of my response to you, the part I can provide without actually playing
with a regex, is the following. Regular Expressions are extremely useful.
If you do any programming the effort you put into learning regular
expressions will be worth it. Several of us here use Expresso (from
UltraPico) and recommend it. I just became aware of something similar
called Regular Expression Workbench available from MSDN. I've installed it
but have not yet played with it.

I'll try to play with it later today but no promises.

Bob
 
barry
 
      30th Apr 2008

Thanks for your reply

Imagine the following string
string str = "The Quick Black&Fox &amp; Jumped Over &quote; The & Lazy Dog";

should be

string str = "The Quick Black&amp;Fox &amp; Jumped Over &quote; The & Lazy Dog";

This is a problem with a larger .xml file in which xx&xx is causing
problems in IE.

In fact I have just spent over 50 minutes and managed to get some results
like this

str = Regex.Replace(str, @"\b\s*(?=&[^&amp;|&quote;| & ])\b", "&amp;", RegexOptions.None);

And last but not least, I collect all answers posted to my Regex queries
for later use.
 
eBob.com
 
      1st May 2008
Well ... I did know before I looked at this that I was far from an expert in
regular expressions. But I know that even better now! But I think that I
did eventually find a solution.

I began by ignoring the replace aspect of the problem and tried to just find
a regular expression that would find the right ampersands. At first I did
not see how to find an "&" NOT followed by a specific STRING. But after
some research in the great Balena book I learned that I could use what is
called a "zero-width negative look-ahead assertion". The syntax for one of
those is "(?!subexpr)". So using this expression
(?<desiredamp>&(?!amp;))
I was able to find ampersands except for those followed by "amp;". Great!
I thought I was on my way and leapt, without sufficient thought, to
(?<desiredamp>&((?!amp;)|(?!quote;)))
but that catches all ampersands. It catches &amp; because that's an & not
followed by "quote;". And catches &quote; because that's an & not followed
by "amp;".

After more thought I came up with
(?<desiredamp>&(?!(amp;)|(quote;)))
which I think finds the ampersands which you want to find.
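
The difference between the last two expressions can be checked with a short C# program. This is only a sketch: the class name LookaheadDemo, the helper MatchIndexes and the shortened test string are made up for illustration, and the named group is left out because it does not affect what matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns the index of every match of the pattern.
    public static int[] MatchIndexes(string pattern, string input)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        int[] indexes = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            indexes[i] = matches[i].Index;
        }
        return indexes;
    }

    public static void Main()
    {
        string input = "Black&Fox &amp; Jumped &quote; The & Lazy";

        // Two separate look-aheads joined by "|": at least one of them always
        // succeeds, so every ampersand matches.
        string tooBroad = @"&((?!amp;)|(?!quote;))";

        // One look-ahead holding both alternatives: the ampersand matches only
        // when neither "amp;" nor "quote;" follows it.
        string intended = @"&(?!(amp;)|(quote;))";

        Console.WriteLine(MatchIndexes(tooBroad, input).Length);  // 4
        Console.WriteLine(MatchIndexes(intended, input).Length);  // 2

        // Prints: Black&amp;Fox &amp; Jumped &quote; The &amp; Lazy
        Console.WriteLine(Regex.Replace(input, intended, "&amp;"));
    }
}
```

Because the look-ahead is zero-width, only the "&" itself is consumed, so the replacement never touches the text that follows it.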

Another plug for Expresso, it was absolutely invaluable in researching this!

The expression above does find the & in "The & Lazy Dog". But if you don't
want that one I am sure you can see how to alter the expression to eliminate
it. I also did not worry about the replace aspect of the problem, I'm sure
you don't need help with that.

Good Luck, Bob
 
Jesse Houwing
 
      2nd May 2008
Hello Barry,

Let's try to write out the pattern you're looking for.

You're specifically looking for a pattern that consists of an &, not followed
by any alphanumeric characters and a ;. Now if you write it out like that,
it becomes quite simple:

&(?![a-z0-9]+;)

That's it... Now, replace those with &amp; and you're done.
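
As a rough C# sketch of that replace step, run against the sample string from earlier in the thread: the class name AmpersandEscaper and the method Escape are hypothetical, and RegexOptions.IgnoreCase is an addition that the pattern above does not have, included so that upper-case names are also treated as entities.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmpersandEscaper
{
    // An ampersand that does not start something shaped like "&name;".
    // IgnoreCase is an assumption added here, not part of the posted pattern.
    private static readonly Regex BareAmpersand =
        new Regex(@"&(?![a-z0-9]+;)", RegexOptions.IgnoreCase);

    // Hypothetical helper: escapes every bare ampersand in the text.
    public static string Escape(string text)
    {
        return BareAmpersand.Replace(text, "&amp;");
    }

    public static void Main()
    {
        string str = "The Quick Black&Fox &amp; Jumped Over &quote; The & Lazy Dog";

        // Prints: The Quick Black&amp;Fox &amp; Jumped Over &quote; The &amp; Lazy Dog
        Console.WriteLine(Escape(str));
    }
}
```

Two things to keep in mind with this sketch: the stand-alone "&" in "The & Lazy Dog" is escaped as well, and a numeric reference such as &#38; would be escaped again, because "#" is not in the character class.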

Jesse

--
Jesse Houwing
jesse.houwing at sogeti.nl
 
barry
 
      2nd May 2008

Well, I work on freelancer sites and one buyer had posted 3 XML files which
he/she could not read in IE. I tried them myself; it would fail on some lines
with IE giving the following error message

A semi colon character was expected. Error processing resource
'file:///C:/3Xmls/canales.es_9159529468.xml'. Line 16590, P...

Once the & was replaced with &amp; it would move further and show an error on
another line.

The buyer wanted the errors corrected in the entire files. It was possible
to do a find/replace (carefully) in a text editor; I have no intention of
hacking and do not have the time to do so.

If you want I can send you one of those files (I do not have the permission
to do so, but that does not matter).
Following is one such problem node; link is the problem node

<video>
<idvideo>Publicidad</idvideo>
<nombre>Publicidad</nombre>
<descripcion>Publicidad</descripcion>
<url>http://www.xxxxxxxxx.tv/xxx/redir.php?pf=zoneid__18;n__ae371c90;cb__786592291</url>
<link>http://www.xxxxxxxxx.tv/ads/redir.php?clk=1&pf=zoneid__18;n__ae371c90;cb__786592291</link>
<category>preroll</category>
<thumbnail></thumbnail>
</video>

"Tigger" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Is this a case of correcting badly encoded data? Is the source data
> expected to be correctly encoded html/xml?
>
> It seems encoding certain "&"s while ignoring others is hacking around a
> problem instead of sorting out why the source data is incorrect.
>
> Also, in your example you encode one "&" at "Black&Fox" while ignoring
> another at "The & Lazy". So what are the rules?
>
> --
> Tigger
> http://www.mccreath.org.uk

 
 
barry
 
      2nd May 2008
Hello Jesse

Will this work on an entire XML file (it has over 20,000 lines)? There are
many lines with such problems. The actual job is long over; I am only trying
to understand regex in such cases.

Barry
 
barry
 
      2nd May 2008
Thanks Jesse

It does the replace in the entire xml file.
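
A minimal C# sketch of the whole-document case, assuming the file has already been read into one string (for example with File.ReadAllText). The class name XmlAmpersandRepair, the method Repair and the shortened stand-in for the problem node are hypothetical; the pattern is the one from Jesse's reply, with RegexOptions.IgnoreCase added as an assumption.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class XmlAmpersandRepair
{
    // Hypothetical helper: escapes bare ampersands in the raw file text.
    public static string Repair(string rawXml)
    {
        return Regex.Replace(rawXml, @"&(?![a-z0-9]+;)", "&amp;", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Shortened stand-in for the problem node; the real text would come
        // from the damaged file.
        string rawXml =
            "<video><link>http://www.xxxxxxxxx.tv/ads/redir.php?clk=1&pf=zoneid__18</link></video>";

        string repaired = Repair(rawXml);

        // XDocument.Parse throws an XmlException if the text is still not
        // well-formed, so reaching the next line means the repair was enough.
        XDocument doc = XDocument.Parse(repaired);

        // Prints: http://www.xxxxxxxxx.tv/ads/redir.php?clk=1&pf=zoneid__18
        Console.WriteLine(doc.Root.Element("link").Value);
    }
}
```

Regex.Replace processes the whole string in one call, so the number of lines in the file does not change how it is used. Note that XML predefines only amp, lt, gt, apos and quot; a file that really contains &quote; would still be rejected by the parser unless that entity is declared.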

Barry
 
Jesse Houwing
 
      2nd May 2008
Hello Barry,

> Thanks Jesse
>
> It does the replace in the entire xml file.
>


You're welcome

Jesse
 
 
 