Regex Question

AMP

Hello,
I am coming back to a project and I don't remember what the following
regex does.
I do know it removes all \r\n from the string, but I don't see how.
Can someone explain this one?

Regex re = new Regex(@"([\x00-\x1F\x7E-\xFF]+)",
RegexOptions.Compiled);
string op = re.Replace(FileToParse, "");

Thanks
Mike
 
Gilles Kohl [MVP]

How does it work? The outer parentheses are redundant IMHO. The regex
boils down to a positive character group with two ranges, whose start
and end points are written as hexadecimal escapes: \x00-\x1F (0 to 31
in decimal) and \x7E-\xFF (126 to 255 in decimal). With the appended
"+", it basically means "one or more consecutive characters in the
range 0-31 or 126-255". \r (13) and \n (10) both fall into the first
range, which is why they get removed.
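To see the range boundaries for yourself, here is a minimal sketch that tests single code points against the character group. The class name RangeCheck and the helper IsMatched are made up for illustration, and I have dropped the parentheses and the "+" since only one character is tested at a time.

```csharp
using System;
using System.Text.RegularExpressions;

static class RangeCheck
{
    // The character group from the question, without the redundant
    // parentheses and without the "+" quantifier.
    static readonly Regex CharClass = new Regex(@"[\x00-\x1F\x7E-\xFF]");

    // Hypothetical helper: true if the given UTF-16 code unit is in the group.
    public static bool IsMatched(int codeUnit)
    {
        return CharClass.IsMatch(((char)codeUnit).ToString());
    }

    static void Main()
    {
        // Probe the edges of both ranges, plus \n (10) and \r (13).
        int[] probes = { 10, 13, 31, 32, 125, 126, 255, 256 };
        foreach (int p in probes)
        {
            Console.WriteLine("{0,3} (0x{0:X2}): {1}", p, IsMatched(p));
        }
    }
}
```

10, 13, 31, 126 and 255 should come back True; 32, 125 and 256 should come back False.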

Replacing all these occurrences with nothing (empty string) does far
more than just remove \r and \n - it removes all characters in the
ranges 0-31 and 126-255. The intention is probably to kill anything
that is not a printable "ASCII" character. Unfortunately, it also kills
the tilde "~" (126).
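A quick way to check the side effects is to run the pattern over a string that mixes line breaks, a tab and a tilde. This is only a sketch: the sample string, the class name ControlCharDemo and the method Strip are invented for the demonstration, and RegexOptions.Compiled is left out because it does not change what gets matched.

```csharp
using System;
using System.Text.RegularExpressions;

static class ControlCharDemo
{
    // Same pattern as in the question, minus the redundant parentheses.
    static readonly Regex Original = new Regex(@"[\x00-\x1F\x7E-\xFF]+");

    // Hypothetical helper wrapping the Replace call from the question.
    public static string Strip(string input)
    {
        return Original.Replace(input, "");
    }

    static void Main()
    {
        // Made-up sample: CR/LF (13, 10), a tab (9) and a tilde (126).
        string sample = "line1\r\nline2\t~/home";

        // CR, LF, tab and the tilde are all removed.
        Console.WriteLine(Strip(sample)); // prints "line1line2/home"
    }
}
```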

It will also remove e.g. accented characters and umlauts in the range
128-255. What it will NOT remove are Unicode characters from 256
upwards.

Try e.g.

string originalString = "Testing <\u00e7> <\u0107> ";

Regex re = new Regex(@"([\x00-\x1F\x7E-\xFF]+)",
RegexOptions.Compiled);
string replacedString = re.Replace(originalString, "");

MessageBox.Show(originalString);
MessageBox.Show(replacedString);

The first "special" character, a lowercase c with cedilla (U+00E7,
231 decimal), will be removed. The second one, a lowercase c with acute
accent (U+0107, 263 decimal), will not be affected.

(My suggestion, if your intention is to remove anything not in the
range 32-126, would be to use this:

Regex re = new Regex(@"[^\x20-\x7E]+", RegexOptions.Compiled);

instead.)
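For comparison, a small sketch of the negated character group applied to a string containing a tilde, both "special" characters and a line break. The class name PrintableAsciiDemo, the method KeepPrintableAscii and the sample string are invented for the example; whether you actually want to drop everything outside 32-126 depends on your data.

```csharp
using System;
using System.Text.RegularExpressions;

static class PrintableAsciiDemo
{
    // Negated group: matches anything outside 0x20-0x7E (space through tilde).
    static readonly Regex NonPrintable = new Regex(@"[^\x20-\x7E]+");

    // Hypothetical helper: keeps only characters in the range 32-126.
    public static string KeepPrintableAscii(string input)
    {
        return NonPrintable.Replace(input, "");
    }

    static void Main()
    {
        // Made-up sample: tilde, c with cedilla (U+00E7),
        // c with acute accent (U+0107), then CR/LF.
        string sample = "~ <\u00e7> <\u0107>\r\n";

        // The tilde survives; both accented characters and CR/LF are removed.
        Console.WriteLine(KeepPrintableAscii(sample)); // prints "~ <> <>"
    }
}
```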

Regards,
Gilles.
 
