Does String.Replace always recreate a new string object?

Sin Jeong-hun

If I use something like this,

string html = "<h1>C# is great</h1>";
Console.WriteLine(html.Replace("&lt;","<").Replace("&gt;",">"));

Does this create a new string object twice, even though there is
nothing to replace and it would be fine to return the same string object?

I know regular expressions are a more refined way to do this, but they
always seem too complicated to me.
Thank you.
 
Hi Jeong.
The clue is in your title...

"Does String <snip> always recreate a new string object?"

Yes. While strings are a reference-type, they behave like a value-type.

Strings are immutable.

While the JITter and runtime may treat strings differently, and optimize, the
spec (the interface) says that all strings are always copied.

public void example(string a, string b)
{
    // c is allocated and 'a' and 'b' are copied into c.
    // O(N) operation: one copy for each char.
    string c = a + b;
}

Happy Coding
- Michael Starberg
 
Thanks for the reply. I thought maybe the runtime would do some optimization,
since it is OK to return the same string object.
 
Michael Starberg said:
Yes. While strings are a reference-type, they behave like a value-type.

No, they don't. They behave exactly like a reference type.

Michael Starberg said:
Strings are immutable.

That doesn't cause them to behave like a value type. Being a value type
definitely does not imply that the type is immutable (maybe in a perfect
world it would, but this is the imperfect world of C# :) ). Being
immutable doesn't cause a reference type to behave like a value type.

You are right that, being immutable, any method that returns a string
different from the string being used to call the method will necessarily
return a new instance of the string class. But please don't confuse the
question of mutable vs. immutable with the question of reference type vs.
value type. They aren't the same.

Pete
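
To make the distinction concrete, here is a minimal sketch. The types MutablePoint and ImmutableName are hypothetical and exist only for this illustration; they are not from the thread. It shows that copy-versus-share is decided by value type versus reference type, while mutability is a separate property of each type.

```csharp
using System;

// Hypothetical types for illustration only.
struct MutablePoint            // a value type that is mutable
{
    public int X;
}

sealed class ImmutableName     // a reference type that is immutable
{
    public string Value { get; }
    public ImmutableName(string value) { Value = value; }
}

static class Demo
{
    // Assigning a struct copies the value, so changing the copy leaves the original alone.
    public static bool CopyIsIndependent()
    {
        MutablePoint a = new MutablePoint { X = 1 };
        MutablePoint b = a;
        b.X = 2;
        return a.X == 1 && b.X == 2;
    }

    // Assigning a class instance copies the reference; both variables see one object.
    public static bool AssignmentSharesReference()
    {
        ImmutableName n1 = new ImmutableName("x");
        ImmutableName n2 = n1;
        return object.ReferenceEquals(n1, n2);
    }

    // string follows the reference-type rule: assignment shares the same instance.
    public static bool StringAssignmentSharesReference()
    {
        string s1 = new string('a', 3);
        string s2 = s1;
        return object.ReferenceEquals(s1, s2);
    }

    static void Main()
    {
        Console.WriteLine(CopyIsIndependent());               // True
        Console.WriteLine(AssignmentSharesReference());       // True
        Console.WriteLine(StringAssignmentSharesReference()); // True
    }
}
```

Immutability only guarantees that nobody can change the shared instance; it does not turn the assignment into a copy.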
 
Amen!

As for choosing between classes and structs, I think I have found a pretty
good way to approach the difference.

- Michael Starberg
 
Sin Jeong-hun said:
Thanks for the reply. I thought maybe the runtime would do some optimization,
since it is OK to return the same string object.

Well, a classic Delphi string is very smart.
It keeps bookkeeping data at negative offsets from the first character
(string[0 -> -3]): the length of the actual string and, further back, the
capacity of the string. Further back still is the number of references to it.

As Microsoft has licensed pretty much every patent Borland ever filed, I
don't see why the runtime would skip this smart way of handling strings.

Strings may not be as 'immutable' as they seem, but may rather follow the
copy-on-write scheme that Object Pascal more or less introduced.

But that does not matter. The C# spec clearly says that strings are
immutable. Code that way!
... and let the JITter/runtime take care of the rest. =)

Happy Coding
- Michael Starberg
 
No, Replace (string, string) does not always create a new string.

Hilton
 
Michael Starberg said:
The clue is in your title...

"Does String <snip> always recreate a new string object?"

Yes. While strings are a reference-type, they behave like a value-type.

Strings are immutable.

None of that prevents the optimisation that Sin was talking about - a
replacement where nothing changed.

It would be perfectly okay for "foo".Replace('x', 'y') to return the
original reference. It wouldn't break immutability in the slightest.
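
A minimal sketch of why such an optimisation is safe. ReplaceOrSame is a hypothetical helper written for this illustration, not the framework's implementation: it hands back the original reference when there is nothing to replace, and the caller still cannot observe any change to the original string.

```csharp
using System;

static class StringSketch
{
    // Hypothetical helper: returns the very same instance when oldValue does not occur.
    public static string ReplaceOrSame(string source, string oldValue, string newValue)
    {
        // Ordinal search, matching the ordinal behaviour of Replace(string, string).
        if (source.IndexOf(oldValue, StringComparison.Ordinal) < 0)
        {
            return source; // nothing to replace, so no new object is needed
        }
        return source.Replace(oldValue, newValue);
    }

    static void Main()
    {
        string original = new string('a', 5);

        string same = ReplaceOrSame(original, "b", "c");
        string changed = ReplaceOrSame(original, "a", "b");

        Console.WriteLine(object.ReferenceEquals(original, same)); // True
        Console.WriteLine(changed);                                // bbbbb
        Console.WriteLine(original);                               // aaaaa
    }
}
```

Because no code path ever writes to the characters of original, sharing the reference is indistinguishable from returning an equal copy, apart from a reference-equality check.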
 
Easy to test:

String s1 = "aaaaa";
String s2 = s1.Replace("b", "c");
Debug.WriteLine(Object.ReferenceEquals(s1, s2));

Outputs True for .NET 3.5 and 2.0 SP1.
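
The same check can be applied to the snippet from the original question. This is only a sketch: Decode is a hypothetical wrapper name, and whether the reference comparison prints True depends on the runtime version, so run it on the runtime you care about rather than relying on the result reported above.

```csharp
using System;

static class ReplaceCheck
{
    // Hypothetical wrapper around the chained Replace calls from the question.
    public static string Decode(string html)
    {
        return html.Replace("&lt;", "<").Replace("&gt;", ">");
    }

    static void Main()
    {
        string html = "<h1>C# is great</h1>";
        string result = Decode(html);

        // The content is unchanged because neither entity occurs in the input.
        Console.WriteLine(result == html);

        // Whether this is the same instance is an implementation detail of the runtime.
        Console.WriteLine(object.ReferenceEquals(html, result));

        // When the entities are present, a new string with the replacements is produced.
        Console.WriteLine(Decode("&lt;b&gt;"));
    }
}
```

Either way the code behaves correctly; the reference check only tells you whether an allocation was avoided.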
 
This returns true because the CLR uses string interning to optimize
string handling.
 
This returns true because the CLR uses string interning to optimize
string handling.

No it doesn't. String interning has nothing to do with this.

To prove that:
using System;

class Test
{
    static void Main()
    {
        // Built from a char array at run time, so not a compile-time literal.
        string s1 = new string(new char[]{'a','a','a'});
        string s2 = s1.Replace("b", "c");

        Console.WriteLine(object.ReferenceEquals(s1, s2));
        Console.WriteLine(string.IsInterned(s1)==null);
    }
}

This prints "True" twice, which shows that s1 *isn't* interned.
 
This returns true because the CLR uses string interning to optimize
string handling.

Actually it seems to work even if the strings aren't interned, suggesting it
is instead a "I didn't update anything, return the original reference"
optimisation; since it is an internal-call, it is hard to check...

Anyways, s1 and s2 come out as ref-equals, even though it isn't interned.

Marc

string s0 = "foobar"; // expect this to be interned
Console.WriteLine("s0: {0}", s0);
Console.WriteLine("s0 interned: {0}", string.IsInterned(s0) != null);
string s1 = new string('a', 5); // don't expect this to be interned
Console.WriteLine("s1: {0}", s1);
Console.WriteLine("s1 interned: {0}", string.IsInterned(s1) != null);
string s2 = s1.Replace("b", "c");
Console.WriteLine("s2: {0}", s2);
Console.WriteLine("ref-equals: {0}", object.ReferenceEquals(s1, s2));
Console.WriteLine("s1 interned: {0}", string.IsInterned(s1) != null);
Console.WriteLine("s2 interned: {0}", string.IsInterned(s2) != null);
 