remove double space from string

mp

I posted a question with this subject an hour ago and it hasn't shown up yet (in
my newsreader), so apologies if it eventually does appear (or has appeared on
your end).
I wrote this function to remove double spaces (and some other unwanted
whitespace) from an input string.
Is there a better way?
thanks
mark

private string RemoveExcessWhitespace(string inputString)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(inputString);
    int lastLength = 0;
    int thisLength = 0;
    do
    {
        lastLength = sb.Length;
        sb.Replace('\t', ' ');   // tab -> space
        sb.Replace("  ", " ");   // two spaces -> one space
        sb.Replace(") ", ")");   // no space after ')'
        sb.Replace(" (", "(");   // no space before '('
        thisLength = sb.Length;
    } while (thisLength != lastLength);   // repeat until nothing shrinks

    return sb.ToString();
}
 
Luuk

mp said:
[...]
do
{
    lastLength = sb.Length;
    sb.Replace('\t', ' ');

I don't think the length will change after the replacement above?
 
Luuk

mp said:
I wrote this function to remove double spaces (and some other unwanted
whitespace) from an input string.
is there a better way?
[...]

and:
http://oreilly.com/windows/archive/csharp-regular-expressions.html
shows:
Removing Leading and Trailing Whitespace
string t9a = " leading";
string p9a = @"^\s+";
string r9a = Regex.Replace(t9a, p9a, "");

string t9b = "trailing ";
string p9b = @"\s+$";
string r9b = Regex.Replace(t9b, p9b, "");

So maybe, if you change that a bit, you could use Regex.Replace too?
 
mp

Luuk said:
[...]
do
{
    lastLength = sb.Length;
    sb.Replace('\t', ' ');

I don't think the length will change after the replacement above?

Luuk

thanks, good point... that could be moved out of the loop
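
A minimal sketch of that change, assuming the intent of the posted function is otherwise unchanged: the tab replacement swaps one character for another and never alters the length, so it runs once before the loop, while the replacements that shrink the string stay inside it. The class name WhitespaceCleaner is made up for illustration, and the double-space pattern assumes the original meant "two spaces become one".

```csharp
using System.Text;

public static class WhitespaceCleaner
{
    // Same rules as the posted function; only the tab replacement is hoisted.
    public static string RemoveExcessWhitespace(string inputString)
    {
        StringBuilder sb = new StringBuilder(inputString);

        // One-for-one character swap: the length never changes, so once is enough.
        sb.Replace('\t', ' ');

        int lastLength;
        do
        {
            lastLength = sb.Length;
            sb.Replace("  ", " ");   // two spaces -> one space
            sb.Replace(") ", ")");   // drop the space after ')'
            sb.Replace(" (", "(");   // drop the space before '('
        } while (sb.Length != lastLength);   // repeat until nothing shrinks

        return sb.ToString();
    }
}
```

The remaining three replacements are kept in the loop here because a single pass over a run of three or more spaces still leaves a double space behind.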
 
mp

Luuk said:
[...]
and:
http://oreilly.com/windows/archive/csharp-regular-expressions.html
shows:
Removing Leading and Trailing Whitespace
string t9a = " leading";
string p9a = @"^\s+";
string r9a = Regex.Replace(t9a, p9a, "");

string t9b = "trailing ";
string p9b = @"\s+$";

So maybe, if you change that a bit, you could use Regex.Replace too?

thanks, but leading and trailing whitespace are handled easily with .Trim()
there is probably a way to get regex to remove intermediate spaces as well,
but I'm not good at regex.
mark
 
Arne Vajhøj

mp said:
I wrote this function to remove double spaces (and some other unwanted
whitespace) from an input string.
is there a better way?
[...]

With the new requirements, I would suggest that you either
do 3 simple Replace calls and 1 Regex.Replace, as shown
in the last thread, or iterate through the string and
act according to the previous and current character (accumulating
in a StringBuilder as you already do). It is a little bit more
code, but as you keep adding new rules, the performance
improvements could be worth it.

Arne
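
The single-pass idea could look roughly like the sketch below. It assumes the same rules as the posted function (tabs count as spaces, runs of spaces collapse to one, no space after ')' or before '('). The class name SinglePassCleaner is hypothetical, and this has not been benchmarked against the Replace loop.

```csharp
using System.Text;

public static class SinglePassCleaner
{
    public static string RemoveExcessWhitespace(string input)
    {
        StringBuilder sb = new StringBuilder(input.Length);

        foreach (char raw in input)
        {
            char c = raw == '\t' ? ' ' : raw;   // treat tabs as spaces
            // Last character already written, or '\0' if nothing has been written yet.
            char prev = sb.Length > 0 ? sb[sb.Length - 1] : '\0';

            if (c == ' ' && (prev == ' ' || prev == ')'))
            {
                continue;   // skip a repeated space, or a space right after ')'
            }

            if (c == '(' && prev == ' ')
            {
                sb.Length--;   // drop the single space that was kept before '('
            }

            sb.Append(c);
        }

        return sb.ToString();
    }
}
```

Each new rule becomes one more check on the previous and current character, and the input is read only once no matter how many rules there are.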
 
Luuk

mp said:
[...]
thanks, but leading and trailing are handled easily with .Trim
there is probably a way to get regex to remove intermediate spaces as well,
but I'm not good at regex.
mark

ok, my csharp is not good ;-)

But here is a simple application with 2 TextBoxes: when the 1st changes, the
text with stripped spaces is shown in the second.

private void textBox1_TextChanged(object sender, EventArgs e)
{
    string source = this.textBox1.Text;
    string rep = @"\s+";
    this.textBox2.Text = Regex.Replace(source, rep, " ");
}


It does need some improvements, because multiple spaces at the start or
end of the string also get replaced with 1 space, leaving this 1 space at
the start or end of the string.
 
mp

thanks to everyone for their ideas and help
mark

 
Arne Vajhøj

Luuk said:
[...]
private void textBox1_TextChanged(object sender, EventArgs e)
{
    string source = this.textBox1.Text;
    string rep = @"\s+";
    this.textBox2.Text = Regex.Replace(source, rep, " ");
}

It does need some improvements, because multiple spaces at the start or
end of the string also get replaced with 1 space, leaving this 1 space at
the start or end of the string.

That is just a simple Trim.

And the OP may want to use something other than \s, because the
set it matches ([ \t\n\x0B\f\r], plus other Unicode whitespace in
.NET by default) may not be what is needed.

But using the basic + is a lot more elegant than the
negative lookbehind I suggested.

Arne
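
Putting those points together, a hedged sketch: an explicit character class instead of \s, the plain + quantifier, and a Trim at the end. RegexCleaner and Collapse are illustrative names, and the choice of [ \t] (spaces and tabs only, newlines untouched) is an assumption about what the OP wants. The parenthesis rules from the posted function are not covered here.

```csharp
using System.Text.RegularExpressions;

public static class RegexCleaner
{
    // Matches runs of spaces and tabs only; newlines and other whitespace are left alone.
    private static readonly Regex SpaceRuns = new Regex(@"[ \t]+");

    public static string Collapse(string source)
    {
        // Collapse each run to one space, then strip the single space
        // that may remain at the start or end of the string.
        return SpaceRuns.Replace(source, " ").Trim(' ');
    }
}
```

Trim(' ') is used rather than the parameterless Trim() so that leading or trailing newlines are kept, consistent with the character class.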
 
mp

Arne Vajhøj said:
[...]
With the new requirements, I would suggest that you either
do 3 simple Replace calls and 1 Regex.Replace, as shown
in the last thread, or iterate through the string and
act according to the previous and current character (accumulating
in a StringBuilder as you already do). It is a little bit more
code, but as you keep adding new rules, the performance
improvements could be worth it.

Arne

I'll try to work up some timing tests to compare regex with stepping through
characters, etc.
I see now that several of my tests don't need to be in the loop, just a simple
.Replace
thanks
mark
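
One way such a timing test might be set up is with System.Diagnostics.Stopwatch. This is only a sketch: TimingSketch and TimeIt are made-up names, the iteration count is up to the caller, and results will vary with the input and the machine.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the action `iterations` times and returns the elapsed milliseconds.
    public static long TimeIt(Action action, int iterations)
    {
        action();   // one warm-up call so JIT compilation is not measured

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.ElapsedMilliseconds;
    }
}
```

For example, TimingSketch.TimeIt(() => Regex.Replace(sample, @"\s+", " "), 100000) could be compared with the same call wrapping the character-stepping version, where sample is whatever test string is representative of the real input.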
 
