Strings.. Objects or not???

Cor Ligthert

Jon,
Exactly - even if they didn't *call* them strings, the concept was
still there. I don't see why you're dismissing it as a legacy concept -
what would you replace it with?

The difference is the mutability. In Cobol it is very easy to rename your
byte area and use it in a completely different way, for instance to set even
a single bit. This was done, for instance, when memory was needed, or simply
to do things that are now done with methods such as SubString.

Later there came concepts that were more dedicated to "Variables". You may
think differently, but I hated that, because you no longer had any control
over the memory. (This is not a part of C, which is why I expressly wrote
*Methods* of C, referring to methods that were made.) It was at its worst in,
for instance, Basic for people like me. I do not mean to provoke you with
that, though.

:)

I see the immutability of the string as a kind of legacy from that "Variable"
time. It is just an idea, though, and it does not bother me at all now that we
have so much memory.

Um, I don't think so, actually. "Word" can have varying meanings.
What's more confusing is that "byte" doesn't always mean "8 bits" -
maybe that's what you meant?

In the past it was always confusing when people talked about words, so at a
certain moment everybody was suddenly talking about bytes, although in the
beginning that only meant 8-bit words.

Cor
 
Jon Skeet [C# MVP]

Cor Ligthert said:
The difference is the mutability.

In that case, you're definitely not talking about a C/C++ legacy, as
std::strings are mutable, and in C there's nothing to stop you from
changing the memory.

Cor Ligthert said:
In Cobol it is very easy to rename your byte area and use it in a completely
different way, for instance to set even a single bit. This was done, for
instance, when memory was needed, or simply to do things that are now done
with methods such as SubString.

Later there came concepts that were more dedicated to "Variables". You may
think differently, but I hated that, because you no longer had any control
over the memory. (This is not a part of C, which is why I expressly wrote
*Methods* of C, referring to methods that were made.) It was at its worst in,
for instance, Basic for people like me. I do not mean to provoke you with
that, though.

Didn't understand any of that, but never mind.

Cor Ligthert said:
I see the immutability of the string as a kind of legacy from that "Variable"
time. It is just an idea, though, and it does not bother me at all now that we
have so much memory.

It's certainly not a C/C++ legacy. Java has immutable strings, and I
think they're wonderful - I'd rather not worry about other methods
changing the contents of my strings, etc. It also means that the amount
of memory required doesn't change - if you wanted to expand a string,
it could require relocation and copying, etc, which is a general pain.
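
A minimal sketch of the point about other methods not being able to change your strings. It is written in Java, which the post above cites as having immutable strings, and it assumes .NET strings behave the same way in this respect. The class and method names are made up for the illustration.

```java
final class ImmutableStringDemo {
    // Hypothetical helper: looks like it modifies its argument, but cannot.
    static String withSuffix(String text) {
        // concat never alters 'text'; it returns a different String object.
        return text.concat("-changed");
    }

    // A StringBuilder, by contrast, is mutated in place.
    static void appendSuffix(StringBuilder text) {
        text.append("-changed");
    }

    public static void main(String[] args) {
        String original = "abc";
        String result = withSuffix(original);
        System.out.println(original); // abc (the caller's string is untouched)
        System.out.println(result);   // abc-changed

        StringBuilder buffer = new StringBuilder("abc");
        appendSuffix(buffer);
        System.out.println(buffer);   // abc-changed (the caller sees the change)
    }
}
```

The mutable builder is included only for contrast: it is the kind of object where a callee can change what the caller holds.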

<snip>
 
Cor Ligthert

Jon,
It's certainly not a C/C++ legacy.

That is what I tried to write in the part you did not understand.

And with that I was retracting a previous message of mine in which I had
pointed to that, while afterwards I thought: doh, wrong.

I think all is clear now; can this now be EOT?

Cor
 
Michael Giagnocavo [MVP]

Not always. The literals point to an offset in the string heap. So as long as
the offsets into the string heap are the same, we know the strings are the
same. Not having reviewed the source code for this part of the CLR, I'm not
sure how they handle it, but I'd be highly surprised if they skip this
optimization.

-Michael
MVP
www.atrevido.net
 
Guest

I know you are talking about VB.NET, but check this: in VC++ there is a compiler option called /Gf or /GF that creates a single copy of identical strings in the program image and in memory during execution, resulting in smaller programs. This optimization is called *string pooling*.
I think this optimization is already built into the VB.NET compiler, and hence you get this kind of behavior in VB apps.
Anybody, please correct me if I'm wrong.

Hope that helps.

Abubakar.
http://joehacker.blogspot.com
 
Bob Grommes

I think you are talking about string interning, or the string interning
pool, or just plain string pool. This is automatically the behavior of C#
as well as VB.NET -- for constants. You can also take advantage of it for
any string you manipulate, by using String.Intern(). The disadvantage is
that it can slow string assignments down (due to the overhead of searching
the string pool to see if the string needs to be added or if an existing
reference can be returned). However, in many instances this
often-overlooked technique can save tremendous amounts of memory. Many
tables of string values have a lot of repetition.

--Bob

Abubakar said:
I know you are talking about VB.NET but check this, in VC++ there is a
compiler option called /Gf or /GF, that creates a single copy of identical
strings in the program image and memory during execution, resulting in
smaller programs, an optimization called *string pooling*.
I think this optimization is already built into the VB.NET compiler and hence
you get this kind of behavior in VB apps.
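
To make the interning point above concrete, here is a rough sketch. It is written in Java rather than VB.NET or C#, on the assumption that Java's `String.intern()` is close enough to .NET's `String.Intern()` for illustration. The class name `InternDemo` and the helper `buildAtRunTime` are hypothetical.

```java
final class InternDemo {
    // Builds the text at run time, so the result is not a compile-time constant
    // and is not taken from the literal pool.
    static String buildAtRunTime(String prefix) {
        return prefix + "4";
    }

    public static void main(String[] args) {
        String literal1 = "1234";
        String literal2 = "1234";
        String built = buildAtRunTime("123");

        // Identical literals share one pooled object.
        System.out.println(literal1 == literal2);       // true
        // Same characters, but a separate object created at run time.
        System.out.println(built == literal1);          // false
        // intern() looks the text up in the pool and returns the pooled copy.
        System.out.println(built.intern() == literal1); // true
    }
}
```

The pool lookup in the last line is where the overhead mentioned above comes from. The memory saving comes from many equal strings collapsing to one object.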
 
David

Ok, so if

string y1 = "abcdefghijklmnopqrstuvwxy"
string y2 = "abcdefghijklmnopqrstuvwxyz"

That means that y1 is created, then a search algorithm does a string search
all the way to 'y' and then says -- oops, gotta create a new object.

Man. Talk about /overhead/

Sure, but it's compile-time overhead. Which isn't really a big deal.
 
David

I believe the String is a true object, but it overloads the '=' operator.
So when you write the code x1=x2 and x1 and x2 are both strings, it will
actually compile the same as x1 Is x2.

It really won't.

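Dim s As String = "123"
Dim s2 As String = "1234"
s = s & "4"   ' built at run time, so s is a new object, not the literal "1234"

' Reference identity: False here, so nothing is printed.
If s Is s2 Then
    Console.WriteLine("is")
End If

' Value equality: True, so "=" is printed.
If s = s2 Then
    Console.WriteLine("=")
End If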
 
Chris Mullins

[Strings]
Dim s As String = "123"
Dim s2 As String = "1234"
s = s & "4"

If s Is s2 Then
    Console.WriteLine("is")
End If

If s = s2 Then
    Console.WriteLine("=")
End If

Strings in .NET are weird. Even this code doesn't do what you probably think
it does.

First off, strings are immutable - once created, they're not changed. Only
new strings are created. Interning of strings confuses the issue quite a
bit.

A fairly good overview seems to be at:
http://www.sliver.com/dotnet/emails/default.aspx?id=6

Richter, in his .NET book, has a pretty good explanation of strings as well.
He also gets into the encoding (UTF-8/UTF-16) issues surrounding strings,
including the StringInfo class and all sorts of other goodies.
 
Peter Bromley

I believe the String is a true object, but it Overloads the '=' Operator.
It really won't.

Dim s As String = "123"
Dim s2 As String = "1234"
s = s & "4"

If s Is s2 Then
    Console.WriteLine("is")
End If

If s = s2 Then
    Console.WriteLine("=")
End If

So the lesson here is don't ever use the '=' Operator with strings.

You must use the Equals method for comparison.

Personally speaking, I would have thought it made more sense that the
string class overrode the '=' operator so that it behaved as the Equals
method (thus making the class behave more like a value class) but there
must have been good reasons to do things the way they did....

--
If you wish to reply to me directly, my address is spam proofed as:

pbromley at adi dot co dot nz

Or if you prefer - (e-mail address removed) :)
 
David

So the lesson here is don't ever use the '=' Operator with strings.

You must use the Equals method for comparison.

Well, that's not the lesson I'd take. I pretty much use '=' exclusively,
which IMHO does exactly what one would presume it does. In general,
you're usually interested in equality, not identity, and I find that
this is especially true with strings.

Peter Bromley said:
Personally speaking, I would have thought it made more sense that the
string class overrode the '=' operator so that it behaved as the Equals
method (thus making the class behave more like a value class) but there
must have been good reasons to do things the way they did....

It does behave as the Equals method. In the above example,

s.Equals(s2)
Object.Equals(s, s2)
s = s2

are all true. Only 's is s2' is false.
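
The equality versus identity split above can be sketched in Java as well. Treat the mapping as an assumption made for illustration: Java's `equals` and `Objects.equals` stand in for `s.Equals(s2)` and `Object.Equals(s, s2)`, while Java's `==` on references corresponds to VB's `Is` (a reference comparison), not to VB's `=`. The class and helper names are hypothetical.

```java
import java.util.Objects;

final class EqualityDemo {
    // Concatenates at run time, mirroring s = s & "4" in the VB example.
    static String concat(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String s = "123";
        String s2 = "1234";
        s = concat(s, "4");

        System.out.println(s.equals(s2));          // true:  value equality
        System.out.println(Objects.equals(s, s2)); // true:  null-safe value equality
        System.out.println(s == s2);               // false: two distinct objects
    }
}
```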
 
Peter Bromley

David said:
Well, that's not the lesson I'd take. I pretty much use '=' exclusively,
which IMHO does exactly what one would presume it does. In general,
you're usually interested in equality, not identity, and I find that
this is especially true with strings.

Well, it's the lesson I painfully learned some months ago :)

David said:
It does behave as the Equals method. In the above example,

s.Equals(s2)
Object.Equals(s, s2)
s = s2

are all true. Only 's is s2' is false.

Perhaps there is some difference between VB and C++ but I was
conclusively bitten by my assumption that == and .Equals did the same
thing for Strings.

If you look at the IL for the following (C++) code:

System::String* s = S"123";
System::String* s2 = S"1234";
s = System::String::Concat(s, S"4");
bool equal = s == s2;     // compiles to ceq on the two pointers
equal = s->Equals(s2);    // calls String::Equals

The == test compiles to "ceq" on the pointers s and s2 and not to a
call to op_Equality as documented in MSDN. Perhaps this is a bug....

I'm curious, what does your VB code compile to for the s = s2 example?

 
Jon Skeet [C# MVP]

Peter Bromley said:
Perhaps there is some difference between VB and C++ but I was
conclusively bitten by my assumption that == and .Equals did the same
thing for Strings.

If you look at the IL for the following (C++) code
System::String* s = S"123";
System::String* s2 = S"1234";
s = System::String::Concat(s, S"4");
bool equal = s == s2;
equal = s->Equals(s2);

The == test compiles to "ceq" on the pointers s and s2 and not to a
call to op_Equality as documented in MSDN. Perhaps this is a bug....

I'm afraid I don't know whether MC++ is meant to use the overloaded ==
operator in the same way that C# does. (Using == in the C# version of
the above would be fine.)

Peter Bromley said:
I'm curious, what does your VB code compile to for the s = s2 example?

It compiles to a call to
Microsoft.VisualBasic.CompilerServices.StringType.StrCmp(s, s2, false).
 
David

Perhaps there is some difference between VB and C++ but I was
conclusively bitten by my assumption that == and .Equals did the same
thing for Strings.

The difference is that the VB '=' and the C++/C# '==' are not the same
operators. In VB, the '=' operator compares strings for equality.
Technically, there are some differences between it and .Equals, but
for the most part they can be treated as if they did the same thing.

Peter Bromley said:
If you look at the IL for the following (C++) code
System::String* s = S"123";
System::String* s2 = S"1234";
s = System::String::Concat(s, S"4");
bool equal = s == s2;
equal = s->Equals(s2);

The == test compiles to "ceq" on the pointers s and s2 and not to a
call to op_Equality as documented in MSDN. Perhaps this is a bug....

I'm curious, what does your VB code compile to for the s = s2 example?

As Jon said, it compiles to
Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(s,s2,false)

The 'Is' operator compiles to a ceq instruction.
 
Peter Bromley

Jon said:
I'm afraid I don't know whether MC++ is meant to use the overloaded ==
operator in the same way that C# does. (Using == in the C# version of
the above would be fine.)

Well, from my reading of MSDN, if a type has a static op_Equality
member defined, then "type == type" should always compile to a call to
that op_Equality member. This has always been the case for value types
(AFAICT) but isn't the case for String. I might do some more research to
see whether this mapping occurs for any heap objects.

Cheers,


 
