StringBuilder Remove function - How to remove single character?

  • Thread starter: deko

deko

I need to loop through a string and remove all characters except numbers or
letters. I am getting an ArgumentOutOfRangeException: "Index was out of
range. Must be non-negative and less than the size of the collection"

Not sure what is going on... any suggestions welcome!

public static string fixStr(string aStr)
{
    StringBuilder bStr = new StringBuilder(aStr.Length);
    for (int i = 0; i < aStr.Length; i++)
    {
        if (!Char.IsDigit(aStr[i]) | !Char.IsLetter(aStr[i]))
        {
            bStr.Remove(i,1);
        }
    }
    return bStr.ToString();
}
 

Your StringBuilder is empty: passing aStr.Length to the constructor only sets
its capacity. You might as well just append the numbers and letters:

public static string fixStr(string aStr)
{
    StringBuilder bStr = new StringBuilder(aStr.Length);
    for (int i = 0; i < aStr.Length; i++)
    {
        char c = aStr[i];
        // Keep only digits and letters.
        if (Char.IsDigit(c) || Char.IsLetter(c))
        {
            bStr.Append(c);
        }
    }
    return bStr.ToString();
}

David
 
Deko,
In addition to David's comments I would pass the string itself to
StringBuilder's constructor instead of the length. This will initialize the
StringBuilder to the value of the string, instead of just setting the
capacity to the length of the string.

Something like:
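public static string fixStr(string aStr)
{
    StringBuilder bStr = new StringBuilder(aStr);
    // Walk backwards so a removal never shifts an index still to be visited.
    for (int i = bStr.Length - 1; i >= 0; i--)
    {
        // Remove only characters that are neither a digit nor a letter.
        if (!Char.IsDigit(bStr[i]) && !Char.IsLetter(bStr[i]))
        {
            bStr.Remove(i, 1);
        }
    }
    return bStr.ToString();
}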


Hope this helps
Jay
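
For later readers, here is a minimal sketch of the constructor difference Jay describes, which also appears to be the cause of the original ArgumentOutOfRangeException. The class name CapacityVersusContent and the helper TryRemoveAt are made up for illustration and are not from the thread.

```csharp
using System;
using System.Text;

public static class CapacityVersusContent
{
    // Hypothetical helper: reports whether Remove(index, 1) succeeds.
    public static bool TryRemoveAt(StringBuilder sb, int index)
    {
        try
        {
            sb.Remove(index, 1);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string input = "ab-12";

        // The int overload only reserves room; the builder holds no characters yet.
        StringBuilder empty = new StringBuilder(input.Length);
        Console.WriteLine(empty.Length);            // 0

        // The string overload copies the characters in.
        StringBuilder filled = new StringBuilder(input);
        Console.WriteLine(filled.Length);           // 5

        Console.WriteLine(TryRemoveAt(empty, 0));   // False: nothing to remove
        Console.WriteLine(TryRemoveAt(filled, 2));  // True: drops the '-'
        Console.WriteLine(filled.ToString());       // ab12
    }
}
```

With the int overload, the very first Remove call fails, which matches the exception in the original post.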

 
Jay B. Harlow said:
Deko,
In addition to David's comments I would pass the string itself to
StringBuilder's constructor instead of the length. This will initialize
the StringBuilder to the value of the string, instead of just setting the
capacity to the length of the string.

But StringBuilder.Remove has to copy all the subsequent char's over the
removed ones, while StringBuilder.Append(char) with a pre-allocated
StringBuilder only has to set the value of an existing char and increment
the length.

David
 
David,
But StringBuilder.Remove has to copy all the subsequent char's over the

:-)

I meant to mention that...

Either way you are "copying characters". If there are a lot of bad characters,
I would expect removing them to perform badly, as you are repeatedly moving
the ones you are keeping. If there are significantly more good characters,
then appending them may not be any better, as you are still copying the
entire string.

I suspect whether the bad characters are at the front end or the tail end
also makes a difference...

The only real way to decide whether either method, or a RegEx or some other
algorithm, is better would be to profile the routines using data specific to
Deko's application.

Lastly, the 80/20 rule suggests it may not matter which method is faster if
this routine is not part of the 20% of the code where the app spends most of
its time.

Hope this helps
Jay
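
A rough sketch of the kind of measurement Jay suggests, using Stopwatch. The class name FixStrTiming, the helper names, the sample path and the iteration count are all assumptions for illustration. Real numbers depend on the data and the runtime, so none are given here.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class FixStrTiming
{
    // Start empty and append the characters to keep.
    public static string ByAppend(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (Char.IsLetter(c) || Char.IsDigit(c)) sb.Append(c);
        }
        return sb.ToString();
    }

    // Start full and remove the characters to drop.
    public static string ByRemove(string s)
    {
        StringBuilder sb = new StringBuilder(s);
        // Walk backwards so a removal never shifts an index still to be visited.
        for (int i = sb.Length - 1; i >= 0; i--)
        {
            if (!Char.IsLetter(sb[i]) && !Char.IsDigit(sb[i])) sb.Remove(i, 1);
        }
        return sb.ToString();
    }

    // Regex variant; \p{N} is slightly broader than Char.IsDigit.
    public static string ByRegex(string s)
    {
        return Regex.Replace(s, @"[^\p{L}\p{N}]", string.Empty);
    }

    public static long TimeMs(Func<string, string> fix, string input, int iterations)
    {
        fix(input); // warm-up call so JIT compilation is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) fix(input);
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Hypothetical sample; substitute directory names from the real application.
        string sample = @"C:\Documents and Settings\deko\My Documents\Project (v2)";
        const int iterations = 100000;

        Console.WriteLine("Append: {0} ms", TimeMs(ByAppend, sample, iterations));
        Console.WriteLine("Remove: {0} ms", TimeMs(ByRemove, sample, iterations));
        Console.WriteLine("Regex:  {0} ms", TimeMs(ByRegex, sample, iterations));
    }
}
```

Stopwatch timings like these are only a first approximation. A profiler run against the real application would show whether this routine matters at all.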
 
In addition to David's comments I would pass the string itself to
But StringBuilder.Remove has to copy all the subsequent char's over the
removed ones, while StringBuilder.Append(char) with a pre-allocated
StringBuilder only has to set the value of an existing char and increment
the length.


Outstanding. This dialogue has really helped.

What I don't understand, however, is how StringBuilder.Append replaces the
bad characters.

What I'm trying to do is take directory names (often long) and make them
into "safe" strings (by limiting them to letters and numbers) that will be
stored in an XML file - from which datasets with constraints and relations
will be built.

I came up with the below code to try to keep some resemblance to the
original directory name. Would this also avoid the additional processing
incurred by "copying the subsequent char's over the removed ones" - since I
am only replacing the char?

public static string fixStr(string aStr)
{
    StringBuilder bStr = new StringBuilder(aStr);
    for (int i = 0; i < bStr.Length; i++)
    {
        // for some reason I could not get "!" to work - perhaps it's just me..
        if (Char.IsDigit(bStr[i]) | Char.IsLetter(bStr[i]))
        {
        }
        else
        {
            bStr.Replace(bStr[i].ToString(), "_");
        }
    }
    return bStr.ToString();
}
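
On the question of extra copying: StringBuilder.Replace(string, string) replaces every occurrence in the builder, so each call rescans the whole content. A possible alternative, sketched here on the assumption that a one-for-one swap is all that is wanted, is to overwrite the character through the StringBuilder indexer. The class name UnderscoreFix is made up for illustration.

```csharp
using System;
using System.Text;

public static class UnderscoreFix
{
    public static string FixStr(string aStr)
    {
        StringBuilder bStr = new StringBuilder(aStr);
        for (int i = 0; i < bStr.Length; i++)
        {
            // Char.IsLetterOrDigit covers the same characters as IsLetter || IsDigit.
            if (!Char.IsLetterOrDigit(bStr[i]))
            {
                bStr[i] = '_';   // overwrite in place; the length never changes
            }
        }
        return bStr.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(FixStr(@"C:\My Docs\v2"));   // C__My_Docs_v2
    }
}
```

Because the length never changes, no later characters are shifted and the loop index stays valid throughout.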
 

deko,

You can use a regular expression to do this in one line of code:

using System.Text.RegularExpressions;

....

public static string fixStr(string aStr)
{
    // \p{L} = Unicode class for anything considered a letter.
    // \p{N} = Unicode class for anything considered a number.

    return Regex.Replace(aStr, @"[^\p{L}\p{N}]", string.Empty);
}
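
One difference worth knowing, shown as a small sketch: \p{N} covers every Unicode number category, while Char.IsDigit accepts only decimal digits, so the two approaches can disagree on characters such as U+00BD (the fraction one half). The class and method names below are made up for illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class LetterNumberClasses
{
    public static string ByRegex(string s)
    {
        return Regex.Replace(s, @"[^\p{L}\p{N}]", string.Empty);
    }

    public static string ByCharTests(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (Char.IsLetter(c) || Char.IsDigit(c)) sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string input = "a\u00BDb-1";   // U+00BD is in category No (other number)

        Console.WriteLine(ByRegex(input) == "a\u00BDb1");   // True: the fraction is kept
        Console.WriteLine(ByCharTests(input) == "ab1");     // True: the fraction is dropped
    }
}
```

For plain ASCII directory names the two give the same result. If the loop version should match the regex more closely, Char.IsNumber is the nearer counterpart to \p{N}.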
 
deko said:
Outstanding. This dialogue has really helped.

What I don't understand, however, is how StringBuilder.Append replaces the
bad characters.

It doesn't replace them - you just never put them in the StringBuilder
in the first place.
What I'm trying to do is take directory names (often long) and make them
into "safe" strings (by limiting them to letters and numbers) that will be
stored in an XML file - from which datasets with constraints and relations
will be built.

I came up with the below code to try to keep some resemblance to the
original directory name. Would this also avoid the additional processing
incurred by "copying the subsequent char's over the removed ones" - since I
am only replacing the char?

public static string fixStr(string aStr)
{
    StringBuilder bStr = new StringBuilder(aStr);
    for (int i = 0; i < bStr.Length; i++)
    {
        // for some reason I could not get "!" to work - perhaps it's just me..
        if (Char.IsDigit(bStr[i]) | Char.IsLetter(bStr[i]))
        {
        }


! should work fine - but remember you need to change it to

if (!Char.IsDigit(bStr[i]) && !Char.IsLetter(bStr[i]))
{
    ....
}

i.e. !(A || B) == (!A && !B)
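
A small sketch of why the negated test needs && rather than |. No character is both a digit and a letter, so the test from the original post (!A | !B) is true for every character. The class and method names are made up for illustration.

```csharp
using System;

public static class DeMorganCheck
{
    // The intended test: c is neither a digit nor a letter.
    public static bool IsBad(char c)
    {
        return !Char.IsDigit(c) && !Char.IsLetter(c);
    }

    // The test from the original post: true unless c is both a digit
    // and a letter, which never happens.
    public static bool OriginalTest(char c)
    {
        return !Char.IsDigit(c) | !Char.IsLetter(c);
    }

    public static void Main()
    {
        foreach (char c in "a1-")
        {
            bool negatedOr = !(Char.IsDigit(c) || Char.IsLetter(c));
            Console.WriteLine("{0}: IsBad={1}, !(A||B)={2}, original={3}",
                c, IsBad(c), negatedOr, OriginalTest(c));
        }
    }
}
```

IsBad and the negated OR agree on every character, while the original test also flags 'a' and '1' for removal.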
 
Deko,
What I don't understand, however, is how StringBuilder.Append replaces the
bad characters.
As Jon stated it doesn't.

Look closely at David's routine, he starts with an empty StringBuilder then
only appends the characters you want to keep.

Look closely at my variation of your original routine, I start with a full
StringBuilder then remove the characters you want to exclude.

Hope this helps
Jay



 
Look closely at David's routine, he starts with an empty StringBuilder then
only appends the characters you want to keep.

Look closely at my variation of your original routine, I start with a full
StringBuilder then remove the characters you want to exclude.

Yes, now I see... thanks.
 
! should work fine - but remember you need to change it to
if (!Char.IsDigit(bStr[i]) && !Char.IsLetter(bStr[i]))
{
...
}

ie !(A||B) = (!A && !B)


that did the trick - thanks.
 
public static string fixStr(string aStr)
{
// \p{L} = Unicode class for anything considered a letter.
// \p{N} = Unicode class for anything considered a number.

return Regex.Replace(aStr, @"[^\p{L}\p{N}]", string.Empty);
}

10-4. That works. Thanks.
 