Does String.Replace always recreate a new string object?

Sin Jeong-hun

If I use something like this,

string html = "<h1>C# is great</h1>";
Console.WriteLine(html.Replace("&lt;","<").Replace("&gt;",">"));

Does this recreate new string objects two times, even though it has
nothing to replace, and it is OK to return the same string object?

I know, regular expression is a more delicate way to do this, but it
always is kind of too complicated to me.
Thank you.
 
Michael Starberg

Sin Jeong-hun said:
If I use something like this,

string html = "<h1>C# is great</h1>";
Console.WriteLine(html.Replace("&lt;","<").Replace("&gt;",">"));

Does this recreate new string objects two times, even though it has
nothing to replace, and it is OK to return the same string object?

I know, regular expression is a more delicate way to do this, but it
always is kind of too complicated to me.
Thank you.


Hi Jeong.
The clue is in your title...

"Does String <snip> always recreate a new string object?"

Yes. While strings are a reference-type, they behave like a value-type.

Strings are immutable.

While the JITter and runtime may treat strings differently and optimize, the
specs (the interface) say that all strings are always copied.

public void example(string a, string b)
{
    // c is newly allocated and the characters of 'a' and 'b' are copied
    // into it: an O(N) operation over every char.
    string c = a + b;
}

Happy Coding
- Michael Starberg
 
Sin Jeong-hun

Thanks for the reply. I thought maybe the runtime would do some
optimization, since it is OK to return the same string object.
 
Peter Duniho

Michael Starberg said:
Hi Jeong.
The clue is in your title...

"Does String <snip> always recreate a new string object?"

Yes. While strings are a reference-type, they behave like a value-type.

No, they don't. They behave exactly like a reference type.

Michael Starberg said:
Strings are immutable.

That doesn't cause them to behave like a value type. Being a value type
definitely does not imply that the type is immutable (maybe in a perfect
world it would, but this is the imperfect world of C# :) ). Being
immutable doesn't cause a reference type to behave like a value type.

You are right that, being immutable, any method that returns a string
different from the string being used to call the method will necessarily
return a new instance of the string class. But please don't confuse the
question of mutable vs. immutable with the question of reference type vs.
value type. They aren't the same.

Pete
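
To make the distinction concrete, here is a minimal sketch. The types MutablePoint and ImmutableBox are hypothetical, invented only for this illustration. It shows a value type that is mutable, a reference type that is immutable, and that assigning a string copies the reference rather than the characters.

```csharp
using System;

// Hypothetical value type: copied on assignment, yet freely mutable.
public struct MutablePoint
{
    public int X;
}

// Hypothetical reference type: shared on assignment, yet immutable.
public sealed class ImmutableBox
{
    public readonly int Value;
    public ImmutableBox(int value) { Value = value; }
}

public static class TypeKindDemo
{
    public static void Main()
    {
        MutablePoint p1 = new MutablePoint { X = 1 };
        MutablePoint p2 = p1;       // copies the whole value
        p2.X = 2;                   // changes the copy only
        Console.WriteLine(p1.X);    // 1
        Console.WriteLine(p2.X);    // 2

        ImmutableBox b1 = new ImmutableBox(1);
        ImmutableBox b2 = b1;       // copies the reference, not the object
        Console.WriteLine(object.ReferenceEquals(b1, b2)); // True

        string s1 = new string('a', 3);
        string s2 = s1;             // same behaviour as any reference type
        Console.WriteLine(object.ReferenceEquals(s1, s2)); // True
    }
}
```

Mutability and "kind of type" are independent axes: string sits in the same corner as ImmutableBox.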
 
Michael Starberg

Amen!

As for using classes/structs, I think I have found a pretty good way to
attack the problem/difference.

- Michael Starberg
 
Michael Starberg

Sin Jeong-hun said:
Thanks for the reply. I thought maybe the runtime would do some
optimization, since it is OK to return the same string object.

Well, a classic Delphi string is very smart.
It keeps bookkeeping data at negative offsets from the first char of the
string (string[0 -> -3]). That holds the length of the actual string and,
even further back, the capacity of the string. Further back still is the
number of references to it.

As Microsoft has licensed pretty much every patent Borland ever filed, I
don't see why the runtime would skip this smart way to handle strings.

Strings may not be as 'immutable' as they seem, but may rather follow the
copy-on-write scheme that Object Pascal more or less introduced.

But that does not matter. The C# specs clearly say that strings are
immutable. Code that way!
... and let the JITter/runtime take care of the rest. =)

Happy Coding
- Michael Starberg
 
Hilton

Sin Jeong-hun said:
If I use something like this,

string html = "<h1>C# is great</h1>";
Console.WriteLine(html.Replace("&lt;","<").Replace("&gt;",">"));

Does this recreate new string objects two times, even though it has
nothing to replace, and it is OK to return the same string object?

No, Replace (string, string) does not always create a new string.

Hilton
 
Jon Skeet [C# MVP]

Michael Starberg said:
The clue is in your title...

"Does String <snip> always recreate a new string object?"

Yes. While strings are a reference-type, they behave like a value-type.

Strings are immutable.

None of that prevents the optimisation that Sin was talking about - a
replacement where nothing changed.

It would be perfectly okay for "foo".Replace('x', 'y') to return the
original reference. It wouldn't break immutability in the slightest.
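
As a rough sketch of why such an optimisation is safe: the helper below, ReplaceOrSame, is a hypothetical name made up for this example and is not how the framework implements Replace. It returns the very same instance when there is nothing to replace, and the caller still cannot observe any change to the original string.

```csharp
using System;

public static class ReplaceSketch
{
    // Hypothetical helper for illustration; not the framework's implementation.
    public static string ReplaceOrSame(string source, string oldValue, string newValue)
    {
        // Ordinal search, since string.Replace(string, string) compares ordinally.
        if (source.IndexOf(oldValue, StringComparison.Ordinal) < 0)
        {
            return source;   // nothing to replace: hand back the same instance
        }
        return source.Replace(oldValue, newValue);
    }

    public static void Main()
    {
        string original = new string('a', 5);
        string unchanged = ReplaceOrSame(original, "b", "c");
        string changed = ReplaceOrSame(original, "a", "b");

        Console.WriteLine(object.ReferenceEquals(original, unchanged)); // True
        Console.WriteLine(changed);   // bbbbb
        Console.WriteLine(original);  // aaaaa
    }
}
```

Because no string is ever modified in place, sharing the reference is indistinguishable from returning a copy, apart from a reference comparison.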
 
Lasse Vågsæther Karlsen

Sin said:
If I use something like this,

string html = "<h1>C# is great</h1>";
Console.WriteLine(html.Replace("&lt;","<").Replace("&gt;",">"));

Does this recreate new string objects two times, even though it has
nothing to replace, and it is OK to return the same string object?

I know, regular expression is a more delicate way to do this, but it
always is kind of too complicated to me.
Thank you.

Easy to test:

String s1 = "aaaaa";
String s2 = s1.Replace("b", "c");
Debug.WriteLine(Object.ReferenceEquals(s1, s2));

Outputs True for .NET 3.5 and 2.0 SP1.
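
The same check can be applied to the chained call from the original question. This is only a sketch: the class and method names (ChainedReplaceCheck, Decode) are made up for the example, and the result of the reference comparison depends on the runtime's implementation of Replace, so no particular output is promised for it here.

```csharp
using System;

public static class ChainedReplaceCheck
{
    // Hypothetical wrapper around the two Replace calls from the question.
    public static string Decode(string html)
    {
        return html.Replace("&lt;", "<").Replace("&gt;", ">");
    }

    public static void Main()
    {
        string html = "<h1>C# is great</h1>";
        string result = Decode(html);

        // Implementation-dependent: shows whether both calls handed back
        // the original instance when there was nothing to replace.
        Console.WriteLine(object.ReferenceEquals(html, result));

        // The contents are equal either way.
        Console.WriteLine(result == html); // True
    }
}
```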
 
wilbain

Lasse Vågsæther Karlsen said:
Easy to test:

String s1 = "aaaaa";
String s2 = s1.Replace("b", "c");
Debug.WriteLine(Object.ReferenceEquals(s1, s2));

Outputs True for .NET 3.5 and 2.0 SP1.

This returns true because the CLR uses string interning to optimize
string handling.
 
Jon Skeet [C# MVP]

wilbain said:
This returns true because the CLR uses string interning to optimize
string handling.

No it doesn't. String interning has nothing to do with this.

To prove that:
using System;

class Test
{
    static void Main()
    {
        // Built at run time, so s1 is not an interned literal.
        string s1 = new string(new char[]{'a','a','a'});
        string s2 = s1.Replace("b", "c");

        Console.WriteLine(object.ReferenceEquals(s1, s2));
        Console.WriteLine(string.IsInterned(s1)==null);
    }
}

This prints "True" twice, which shows that s1 *isn't* interned.
 
Marc Gravell

wilbain said:
This returns true because the CLR uses string interning to optimize
string handling.

Actually it seems to work even if the strings aren't interned, suggesting it
is instead a "I didn't update anything, return the original reference"
optimisation; since it is an internal-call, it is hard to check...

Anyways, s1 and s2 come out as ref-equals, even though it isn't interned.

Marc

string s0 = "foobar"; // a literal: expect this to be interned
Console.WriteLine("s0: {0}", s0);
Console.WriteLine("s0 interned: {0}", (string.IsInterned(s0) != null));

string s1 = new string('a',5); // built at run time: don't expect this to be interned
Console.WriteLine("s1: {0}", s1);
Console.WriteLine("s1 interned: {0}", (string.IsInterned(s1) != null));

string s2 = s1.Replace("b", "c"); // nothing to replace
Console.WriteLine("s2: {0}", s2);
Console.WriteLine("ref-equals: {0}", object.ReferenceEquals(s1, s2));
Console.WriteLine("s1 interned: {0}", (string.IsInterned(s1) != null));
Console.WriteLine("s2 interned: {0}", (string.IsInterned(s2) != null));
 
