remove double space from string

mp

Arne Vajhøj said:
surprisingly (to me) your (elegant) for loop is the winner hands down!

Often there is a trade off between speed and maintainability.

The Regex solutions are a lot easier to extend to add
more functionality.

The hand coded solution will soon be buried in an
unreadable mess of if statements.
if (s[i] != ' ' || s[i - 1] != ' ')
{
    res.Append(s[i]);
}
now that's a thing of beauty

makes me laugh at my clumsy version :)

currentCharacter = inputString[CharPos];
if (currentCharacter == ' ')
{
    if (preceedingCharacter != ' ')
        returnString.Append(currentCharacter);
}
else
    returnString.Append(currentCharacter);
preceedingCharacter = currentCharacter;

must be why your forloop is so much faster than mine


I would not expect that big a difference in speed between those
two versions. There is no big difference in what they do - it is
just how the code is written.

Arne
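
For reference, the for-loop fragment quoted above can be fleshed out into a complete method. This is only a sketch: the class name ForLoopTrimmer, the handling of the first character, and the null/empty check are assumptions, since the thread shows just the if statement from inside the loop.

```csharp
using System;
using System.Text;

// Hypothetical wrapper around the if statement quoted in the thread.
public static class ForLoopTrimmer
{
    // Copies s, skipping every space that directly follows another space.
    public static string Trim(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        StringBuilder res = new StringBuilder(s.Length);
        res.Append(s[0]); // assumption: the first character has no predecessor, so it is always kept
        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] != ' ' || s[i - 1] != ' ')
            {
                res.Append(s[i]);
            }
        }
        return res.ToString();
    }
}

public static class ForLoopDemo
{
    public static void Main()
    {
        // Brackets make leading/trailing spaces visible.
        Console.WriteLine("[" + ForLoopTrimmer.Trim("a  b    c d") + "]"); // prints [a b c d]
    }
}
```

Starting the loop at index 1 avoids reading s[i - 1] out of range; the original loop may well have handled that differently.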


if i plug my version into your test framework there's not a big
difference...yours is a little faster
however, in my test framework the regex solutions were faster than the loops;
in yours, the loops were faster than the regex

some differences: my input strings were different and my timing method was a
little different
i'm trying to make a common test to compare apples to apples

one question in the meantime
in your test
public class WhileString : MultiSpaceTrimmerTest
{
    public string Trim(string s)
    {
        string res = s;
        int len;
        do
        {
            len = res.Length;
            // replace every pair of spaces with a single space
            res = res.Replace("  ", " ");
        }
        while (res.Length < len);
        return res;
    }
}
since string is immutable the line [res = res.Replace("  ", " ");]
creates a new string each time, right?
would it be "better" to use a StringBuilder in a case like this?
thanks
mark
 
Arne Vajhøj

I test with both String and StringBuilder.

The String version does create a lot of new objects, but
you need a lot of those before it really matters, which
the results should show.

Arne
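
The StringBuilder variant mentioned above is not shown in the thread, so the following is only a guess at what such a version could look like. The class name WhileStringBuilderTrimmer is made up for illustration. StringBuilder.Replace modifies the buffer in place rather than allocating a new string on each pass.

```csharp
using System;
using System.Text;

// Hypothetical StringBuilder counterpart of the WhileString class.
public static class WhileStringBuilderTrimmer
{
    public static string Trim(string s)
    {
        StringBuilder res = new StringBuilder(s);
        int len;
        do
        {
            len = res.Length;
            res.Replace("  ", " "); // two spaces -> one space, done in place
        }
        while (res.Length < len);   // stop once a pass no longer shortens the text
        return res.ToString();
    }
}

public static class WhileStringBuilderDemo
{
    public static void Main()
    {
        Console.WriteLine("[" + WhileStringBuilderTrimmer.Trim("a     b") + "]"); // prints [a b]
    }
}
```

Whether this is measurably faster than the String version depends on string length and the number of passes, which is exactly what the benchmark is meant to show.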
 
mp

Arne Vajhøj said:
I test with both String and StringBuilder.

duh!
of course you did...i forgot that... i was just quickly looking at it again
and I forgot about your stringBuilder version at that moment...like I
mentioned earlier...creeping senility :-{
 
mp

Arne Vajhøj said:
one question in the meantime
[...]

I test with both String and StringBuilder.

The String version does create a lot of new objects, but
you need a lot of those before it really matters, which
the results should show.

Arne

Hi Arne,
if i may impose on your generosity one more time...
I'm trying to understand the logic of your using the constant N to set the
number of tests and adjust the timing report, dependent on the length of the
string....
for (int i = 0; i < N / s.Length; i++)
{
    mstt.Trim(s);
}
long t2 = DateTime.Now.Ticks;
Console.WriteLine(" {0} : {1}", mstt.Name, s.Length * (t2 - t1) / N);

I can guess at the part--- for (int i = 0; i < N / s.Length; i++)
i see that way the longer the string the fewer repetitions of the test
i'm not understanding why you're modifying the timing report also based on
string length and the constant
s.Length * (t2 -t1) / N

i would have thought the actual time consumed was simply (t2 -t1)???
thanks
mark
 
Arne Vajhøj

if i may impose on your generosity one more time...
I'm trying to understand the logic of your using the constant N to set the
number of tests and adjust the timing report, dependent on the length of the
string....
for (int i = 0; i < N / s.Length; i++)
{
    mstt.Trim(s);
}
long t2 = DateTime.Now.Ticks;
Console.WriteLine(" {0} : {1}", mstt.Name, s.Length * (t2 - t1) / N);

I can guess at the part--- for (int i = 0; i < N / s.Length; i++)
i see that way the longer the string the fewer repetitions of the test

Correct.

That way the tests take approximately the same time, instead of long
strings taking a lot longer than short strings.
i'm not understanding why you're modifying the timing report also based on
string length and the constant
s.Length * (t2 -t1) / N

i would have thought the actual time consumed was simply (t2 -t1)???

(t2 - t1) is the time for N / s.Length calls, so dividing by that count:
s.Length * (t2 - t1) / N is the time for 1 call

None of this is particularly relevant for the core problem; these are
just small conveniences to make the test run faster and make
the numbers easy to compare.

Arne
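
The arithmetic above can be sketched as a small helper. The value of N, the class name TimingSketch, and the use of Stopwatch instead of DateTime.Now.Ticks are assumptions made for illustration; the original test framework is not shown in full in this thread.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper mirroring the N / s.Length scheme described above.
public static class TimingSketch
{
    public const int N = 10000000; // assumed value; the thread does not state it

    // Longer strings get proportionally fewer repetitions.
    public static int Repetitions(string s)
    {
        return N / s.Length;
    }

    // elapsedTicks covers N / s.Length calls, so one call costs
    // elapsedTicks / (N / s.Length) = s.Length * elapsedTicks / N.
    public static double TicksPerCall(long elapsedTicks, string s)
    {
        return (double)s.Length * elapsedTicks / N;
    }

    public static double Measure(Func<string, string> trim, string s)
    {
        int reps = Repetitions(s);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < reps; i++)
        {
            trim(s);
        }
        sw.Stop();
        return TicksPerCall(sw.Elapsed.Ticks, s); // TimeSpan ticks, 100 ns each
    }
}

public static class TimingDemo
{
    public static void Main()
    {
        string input = "a  b   c    d";
        double perCall = TimingSketch.Measure(x => x.Replace("  ", " "), input);
        Console.WriteLine("ticks per call: {0}", perCall);
    }
}
```

To report total elapsed time instead, as asked about above, one would simply print the raw elapsed value without the scaling.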
 
mp

Arne Vajhøj said:
Arne Vajhøj said:
On 06-12-2010 20:16, mp wrote:
one question in the meantime
[...]

i would have thought the actual time consumed was simply (t2 -t1)???

(t2-t1) is the time for N / s.Length calls
s.Length * (t2 -t1) / N is the time for 1 call

it's so obvious once you say it! :)
None of this is particularly relevant for the core problem; these are
just small conveniences to make the test run faster and make
the numbers easy to compare.

Arne

thanks,
I was trying to understand why your timings show such different conclusions
from mine, so that's why i was inquiring
in my tests, regex was a lot faster than loops , in yours loop was a lot
faster than regex
i'm trying to figure out what i'm doing wrong to get such a different
conclusion

so i'm trying to make a hybrid test that uses your class/interface structure
but uses my test string and reports in actual milliseconds
to that end i'm working on adapting your example and turning it into a class
that my existing project can use to run the tests

then i can start testing on actual files of data to see what real world
results will look like before finalizing which routine is the winner as far
as speed of processing.

i really like the way you made your tests classes implementing an interface
for ease of extending different tests
thanks
mark
 
Arne Vajhøj

One possible explanation is that I only vary the length of the
total string not the number of consecutive spaces.

Regex may be faster than the while loops in some cases when
there are a lot of consecutive spaces, because regex needs O(1)
passes over the string and the while loops need O(log(n)) passes,
where n is the number of consecutive spaces.

I cannot imagine regex being faster than the for loop. But
remember to build for release to see the best performance of the
for loop.

Arne
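
The pass-count argument above can be checked with a small experiment. This is a sketch under assumptions: the names PassCounter, CountShrinkingPasses, and RegexTrim are invented here, and the pattern " {2,}" is just one plausible regex for this task, not necessarily the one used in either test framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for comparing pass counts.
public static class PassCounter
{
    // Counts how many Replace passes actually shorten the string.
    // Each pass roughly halves a run of spaces, hence about log2(n) passes.
    public static int CountShrinkingPasses(string s)
    {
        int passes = 0;
        string res = s;
        int len;
        do
        {
            len = res.Length;
            res = res.Replace("  ", " ");
            if (res.Length < len) passes++;
        }
        while (res.Length < len);
        return passes;
    }

    // A single regex call collapses any run of two or more spaces.
    public static string RegexTrim(string s)
    {
        return Regex.Replace(s, " {2,}", " ");
    }
}

public static class PassCounterDemo
{
    public static void Main()
    {
        string s = "a" + new string(' ', 16) + "b";
        Console.WriteLine(PassCounter.CountShrinkingPasses(s)); // prints 4 (16 -> 8 -> 4 -> 2 -> 1)
        Console.WriteLine("[" + PassCounter.RegexTrim(s) + "]"); // prints [a b]
    }
}
```

So a test string with long runs of spaces could favor regex over the while loops, while a string with only double spaces would not.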
 
