Abstract class variables question

Peter Duniho

I understand that it is a field. But I thought that since the _objCurrent
is an object type that it is just a pointer but not actually an object until
it is assigned to something so it is null.

I'm not really sure what that has to do with whether the variable is
stored on the stack or not.

_objCurrent isn't on the stack, which is all I was pointing out.
As far as boxing and the heap go, I assumed it was a pointer to the heap
from articles such as this:

http://www.c-sharpcorner.com/Upload...rticleID=967b4ba5-e4f6-4b80-8e6e-0f4cc23f8e6c

I may not have understood it completely.

Ugh. The guy lost me on the comment that read "changing i on the
heap". I don't know whether it's because he doesn't understand, or
he's just not effective at communicating what he does understand, but
his first example of boxing is very misleading.

The boxing of the variable "i" does not in any way relate the variable
"i" to something on the heap. It _copies_ the value from the variable
"i" into a completely new data structure. Once this is done, "i" is
completely irrelevant to the reference instance. It's very misleading
to refer to that instance as being "i on the heap".

It's very important to understand, value types are almost always
copied. The only exception is when passing a value type instance by
reference ("ref" or "out"), and even then you don't get a reference
that you can explicitly pass around as a reference the way you can with
reference types.
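
As a rough sketch of the difference between the usual copy and the "ref" exception (the class and method names here are made up for illustration):

```csharp
using System;

static class PassingDemo
{
    // Receives a copy of the caller's value; the assignment changes only that copy.
    static void SetByValue(int x) { x = 10; }

    // Receives the caller's variable by reference; the assignment changes the caller's variable.
    static void SetByRef(ref int x) { x = 10; }

    public static int AfterByValue() { int a = 5; SetByValue(a); return a; }

    public static int AfterByRef() { int a = 5; SetByRef(ref a); return a; }

    static void Main()
    {
        Console.WriteLine(AfterByValue()); // prints 5
        Console.WriteLine(AfterByRef());   // prints 10
    }
}
```

Note that the "ref" only exists for the duration of the call; there is no way to store it in a field and hand it around later.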

So, when a value type is boxed, the boxed value is a whole new copy of
the original value type. It's sort of like doing something like this:

class BoxedInt
{
    int _value;

    // public so that code outside the class can construct an instance
    public BoxedInt(int value)
    {
        _value = value;
    }

    public int Value
    {
        get { return _value; }
    }
}

And then somewhere in code, doing this:

int i = 5;
BoxedInt boxed = new BoxedInt(i);

The current value of "i" is passed into the constructor, where it's
copied to the private field "_value". You can read it back out, which
then copies the value from the private field to wherever you assign it.
But "i" is only relevant when the instance is first constructed, and
there only as it's passed by value to the constructor.
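
The same thing can be observed with real boxing instead of the BoxedInt stand-in. This is only a minimal sketch; the class and method names are invented for the example:

```csharp
using System;

static class BoxingCopyDemo
{
    // Boxes a local int, changes the local, then unboxes the earlier box.
    public static int UnboxAfterChangingOriginal()
    {
        int i = 5;
        object boxed = i;   // boxing: the value 5 is copied into a new object
        i = 30;             // changes only the local variable "i"
        return (int)boxed;  // unboxing copies the stored value back out
    }

    static void Main()
    {
        Console.WriteLine(UnboxAfterChangingOriginal()); // prints 5
    }
}
```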

Really. :)

Does it have more information about the actual object, such as what type of
object it is?

All that information is with the object itself, not the variable that
refers to it. The variable referencing the object doesn't need to
store it at all, because the information can always be obtained from
the instance data itself.

That is what is confusing about boxing.

According to the article above, unless I missed something, boxing actually
moves the data and you are looking at a different piece of data.

Define "move". To me, "move" means you've removed something from one
place, and put it somewhere else.

That's definitely _not_ what happens with boxing.

Instead, boxing makes a whole new copy of the original value type data,
wraps it up in a reference type instance, and gives you the reference
to that instance back.

For example:

string A = "something";
string B;
B = A;

Both A and B are referencing the same location. Such that if I say 'A =
"Something else";', A and B will both be equal to "Something Else".

Bad example, for a few reasons:

1) the String type is a class, meaning it's a reference type,
meaning it never gets boxed
2) the String type is immutable, meaning that if you write code
that assigns a new value to A, the string referenced by B definitely
will _not_ change (it will still refer to "something").
3) In C#, you can't overload or override the assignment operator,
and so an assignment to a reference type variable will always replace
the reference. The example you've offered will never ever work in C#,
even for a mutable class.

So, what sort of example would work? Here's one that I think gets at
what you're trying to show:

class Mutable
{
    string _str;

    // public so that code outside the class can construct an instance
    public Mutable(string str)
    {
        _str = str;
    }

    public string String
    {
        get { return _str; }
        set { _str = value; }
    }
}

Then:

Mutable A, B;

A = new Mutable("something");
B = A;
A.String = "something else";

In _that_ case, then yes..."B.String" will now also return "something else".

But if you do

string A = "something";
object B;
B = A;
A = "Something Else";

A and B point to different locations whereas A will be "Something Else" and
B will be "something". Both A and B are pointers in both examples.

Assuming your first example did what you wanted, then the second
example still would not. That is, simply changing the type of B from
"string" to "object" doesn't cause boxing to happen. Applying the same
idea to the valid example I offered, you get something like this:

Mutable A;
object B;

A = new Mutable("something");
B = A;
A.String = "something else";

In that case, your statement that "B will be 'something'" is not true.
B continues to reference the same instance as A, and having changed
that instance so that it contains "something else" instead, the
instance of Mutable that B refers to will contain "something else"
(because it's the same instance).

Now, as long as you keep that reference in a variable typed as
"object", you have no easy way to get at the contained string. But
it's there, nonetheless. And you can in fact get it back simply by
casting B back to the Mutable type that it is:

Mutable A, C;
object B;

A = new Mutable("something");
B = A;
A.String = "something else";
C = (Mutable)B;
Console.WriteLine(C.String);

This will print "something else" to the console. More significantly,
once you've initialized A, the actual order of the change to A is
irrelevant. The only requirement is that A is initialized before being
assigned to B, and B is initialized (assigned) before being assigned to
C. The line that changes the instance can be anywhere after A is
initialized, and the code will have the same exact output, and in fact
will do all of the exact same work (just in a different order).

Confused yet? :)

Here's the thing: what you've posted seems to imply that you believe
that any time you assign something to "object", it's boxed. But that's
not true at all. Boxing _only_ happens when you assign a _value_ type
to an "object" variable. Because of the inheritance that is allowed
with reference types, and because "object" is the ultimate base class
for every reference type, any reference instance can be assigned to a
variable of type "object" without any conversion at all. It's already
an "object".

For reference types, it's no different than if you assigned a
"BoolType" instance to a "DataType" variable (using the classes we
started with here). No conversion happens; the reference is simply
assigned from the "BoolType" variable to the "DataType" variable.
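
One way to see the difference, as a sketch only (the class and method names are made up): object.ReferenceEquals reports whether two references point at the same instance.

```csharp
using System;

static class ReferenceAssignDemo
{
    // A string is already a reference type, so the assignment copies the reference only.
    public static bool StringSharesInstance()
    {
        string s = "something";
        object o = s;
        return object.ReferenceEquals(s, o);
    }

    // An int has to be boxed, and every boxing operation creates a new object.
    public static bool TwoBoxesShareInstance()
    {
        int i = 5;
        object o1 = i;
        object o2 = i;
        return object.ReferenceEquals(o1, o2);
    }

    static void Main()
    {
        Console.WriteLine(StringSharesInstance());  // prints True
        Console.WriteLine(TwoBoxesShareInstance()); // prints False
    }
}
```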

The "object" type is also the base class for value types, but only in a
strange, "kind of" sort of way. Value types don't have inheritance, so
it's sort of weird to say that a value type has a base class. But in
C#, they do. Even so, because value types aren't references, you can't
assign a value type to a reference variable typed as "object" without
doing something to bridge the gap.

Boxing is what does that bridging. It copies the value into a whole
new data structure, a structure that is itself a valid reference type.
Only _then_ can the reference, newly created, be assigned to a variable
of type "object".
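
Because the box is a real object, it carries its own type information, and unboxing has to name exactly the type that was boxed. A small sketch, with invented class and method names:

```csharp
using System;

static class BoxTypeDemo
{
    // The boxed instance itself knows what it contains.
    public static Type TypeInsideBox()
    {
        object o = 5;
        return o.GetType();
    }

    // Unboxing to a different value type fails at run time.
    public static bool UnboxAsLongThrows()
    {
        object o = 5;
        try
        {
            long value = (long)o;   // the box holds an int, not a long
            Console.WriteLine(value);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(TypeInsideBox());      // prints System.Int32
        Console.WriteLine(UnboxAsLongThrows());  // prints True
    }
}
```

To get a long out of that box you would unbox to int first and then convert, as in (long)(int)o.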

Now, back to the example you tried to construct, for it to work you
first need to start with an actual value type, so that some boxing will
take place. If you do, you might wind up with something like this:

int A, B;

A = 5;
B = A;
A = 10;

Of course, for the example to work you'd like for changing A to also
change B. But that's not how value types work. So, even though the
above example hasn't even had the boxing introduced to it, you've
already got a situation where changing the original doesn't change the
previously assigned variable.

In that respect, other than the overhead of the boxing, boxing is
actually very much like how value types operate normally. Yes, boxing
creates a copy of the value type. But then, so does any assignment of
a value type. And likewise, since a boxed value type is immutable,
even after you've boxed a value type, there's no way to have multiple
variables reference the data in a way where you can change the data via
one variable and see the change via a different variable.
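
Here is roughly what that looks like with two "object" variables sharing one box (a sketch only; the names are invented for the example):

```csharp
using System;

static class SharedBoxDemo
{
    // Two variables reference the same box; assigning to one of them
    // replaces its reference and leaves the shared box untouched.
    public static int ValueSeenThroughB()
    {
        object a = 5;    // boxes 5
        object b = a;    // b references the same box as a
        a = 10;          // boxes 10 in a new object; a now references that one
        return (int)b;   // b still references the box holding 5
    }

    static void Main()
    {
        Console.WriteLine(ValueSeenThroughB()); // prints 5
    }
}
```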

In other words, it's only going to be confusing if you think boxing
might do something above and beyond just creating a reference-able
instance of a value type.

Maybe it is different with ints and strings.

The int type and string type are _definitely_ handled differently. An
int is a value type, and so to assign an int instance to an object
variable, it has to be boxed. But a string is a reference type, and
assigning a string instance to an object variable requires no such work.

Pete
 
tshad

I think I understand boxing a little better now.

After reading your information and a couple of other articles - I think I
have a better idea what it is and why it is used. The problem was the WHY.
I had seen other articles on the HOW but not much on the why:

int i = 0, j;
object o = i; // boxed
j = (int)o; // unboxed

This shows how it is done but not why you would use it.

You're right in that I didn't understand the string type not being boxed and
the values being boxed and why they were different. But now if I understand
it correctly:

1) value types are stored on the stack (faster)
2) reference types are stored on the heap and have a reference/pointer to
the object that is on the heap.
3) value types are copied to the heap and made into an object and reference
type (in essence) that now has a reference/pointer pointing to the new
object.
4) String types are already reference types and all we are doing when we do
"object o = str" is create a new reference/pointer that points to the object
on the heap.

In the case of a non-string object, the object is the same but we have 2
pointers to that same object. If the object changes value and both
references point to the same object, they will be the same.

But in the case of a string, which is immutable, the 1st reference would
point to the first string and the 2nd reference (after the string is
changed) will point to the new string. And that would be the case, even if
we didn't move it to a new object.

string A = "something";
string B;
B = A;
A += " Else";

gives me the same result as:

string s1 = "something";
object o1 = s1;
s1 += " Else";

Where A <> B and o1 <> s1 because in both cases a new string is created even
though we think we are changing the strings.

I also found a better article/tutorial which I think was trying to say the
same thing as the first one but this one does it better. He isn't
addressing the string reference type but addresses reference types in general.
http://www.jaggersoft.com/csharp_course/13_Boxing_files/frame.htm

I think he is saying the same thing you are saying (but he also doesn't go
into the string reference type).

Hopefully I am getting it right this time.

Peter Duniho said:
I'm not really sure what that has to do with whether the variable is
stored on the stack or not.

_objCurrent isn't on the stack, which is all I was pointing out.

I assume that _objCurrent isn't on the stack because the object is not on
the stack and all reference types are stored on the heap. But is
_objCurrent just a pointer (reference) to the actual object (on the heap)?

Ugh. The guy lost me on the comment that read "changing i on the heap".
I don't know whether it's because he doesn't understand, or he's just not
effective at communicating what he does understand, but his first example
of boxing is very misleading.

I think the communication was the problem. He does mention that the value
type is converted to a reference type. But he doesn't mention (at least I
didn't see it) that this is only concerned with value types.

The boxing of the variable "i" does not in any way relate the variable "i"
to something on the heap. It _copies_ the value from the variable "i"
into a completely new data structure. Once this is done, "i" is
completely irrelevant to the reference instance. It's very misleading to
refer to that instance as being "i on the heap".

It may be misleading but I think the value that gets copied does get copied
to the heap. But in the case of a Class Object that is already on the
heap, you are copying the value from a heap location to a heap location.

It's very important to understand, value types are almost always copied.
The only exception is when passing a value type instance by reference
("ref" or "out"), and even then you don't get a reference that you can
explicitly pass around as a reference the way you can with reference
types.

So, when a value type is boxed, the boxed value is a whole new copy of the
original value type. It's sort of like doing something like this:

class BoxedInt
{
    int _value;

    // public so that code outside the class can construct an instance
    public BoxedInt(int value)
    {
        _value = value;
    }

    public int Value
    {
        get { return _value; }
    }
}

And then somewhere in code, doing this:

int i = 5;
BoxedInt boxed = new BoxedInt(i);

The current value of "i" is passed into the constructor, where it's copied
to the private field "_value". You can read it back out, which then
copies the value from the private field to wherever you assign it. But "i"
is only relevant when the instance is first constructed, and there only as
it's passed by value to the constructor.

So in this case I have "i", which is a value type, and boxed._value (if
_value were public), which are completely different even if they have the
same value at the start. If I then change "i" to 30, boxed._value will
still be 5.

Really. :)


All that information is with the object itself, not the variable that
refers to it. The variable referencing the object doesn't need to store
it at all, because the information can always be obtained from the
instance data itself.


Define "move". To me, "move" means you've removed something from one
place, and put it somewhere else.

My mistake.

I meant copy and not move. Which is why they would be different - which is
what the guy in the original article was trying to point out. That when you
box the value type, the original value is still the same in the same place
(on the stack in his example) and the boxed value is a DIFFERENT variable
(on the heap) with the same value at the moment but since they are actually
different variables, if you change one you don't affect the other.

That's definitely _not_ what happens with boxing.

Instead, boxing makes a whole new copy of the original value type data,
wraps it up in a reference type instance, and gives you the reference to
that instance back.


Bad example, for a few reasons:

I understand now why this is a bad example.

1) the String type is a class, meaning it's a reference type, meaning
it never gets boxed
2) the String type is immutable, meaning that if you write code that
assigns a new value to A, the string referenced by B definitely will _not_
change (it will still refer to "something").
3) In C#, you can't overload or override the assignment operator, and
so an assignment to a reference type variable will always replace the
reference. The example you've offered will never ever work in C#, even
for a mutable class.

So, what sort of example would work? Here's one that I think gets at what
you're trying to show:

class Mutable
{
    string _str;

    // public so that code outside the class can construct an instance
    public Mutable(string str)
    {
        _str = str;
    }

    public string String
    {
        get { return _str; }
        set { _str = value; }
    }
}

Then:

Mutable A, B;

A = new Mutable("something");
B = A;
A.String = "something else";

In _that_ case, then yes..."B.String" will now also return "something
else".


Assuming your first example did what you wanted, then the second example
still would not. That is, simply changing the type of B from "string" to
"object" doesn't cause boxing to happen. Applying the same idea to the
valid example I offered, you get something like this:

Mutable A;
object B;

A = new Mutable("something");
B = A;
A.String = "something else";

In that case, your statement that "B will be 'something'" is not true. B
continues to reference the same instance as A, and having changed that
instance so that it contains "something else" instead, the instance of
Mutable that B refers to will contain "something else" (because it's the
same instance).

I got it.

If the string is inside the object and A and B both point at that object,
then even if the string changes INSIDE the object, A and B still point at
the same object, and therefore A.String and B.String are the same.

But in the above case:

string A = "something";
object B;
B = A;
A = "Something Else";

You really do have 2 references to 2 different objects (a string and an
object) and NOT 2 references to the same object. And because of the way a
string works when you change A you actually get a 3rd object and the
original object is dereferenced by 1 and if nothing else is referencing it,
the GC will get rid of it.

Now, as long as you keep that reference in a variable typed as "object",
you have no easy way to get at the contained string. But it's there,
nonetheless. And you can in fact get it back simply by casting B back to
the Mutable type that it is:

Mutable A, C;
object B;

A = new Mutable("something");
B = A;
A.String = "something else";
C = (Mutable)B;
Console.WriteLine(C.String);

This will print "something else" to the console. More significantly, once
you've initialized A, the actual order of the change to A is irrelevant.
The only requirement is that A is initialized before being assigned to B,
and B is initialized (assigned) before being assigned to C. The line that
changes the instance can be anywhere after A is initialized, and the code
will have the same exact output, and in fact will do all of the exact same
work (just in a different order).

Confused yet? :)

I think I have it. In the above example, I think A = B = C, since all 3 are
reference types and point to the same object (and none are strings).

Here's the thing: what you've posted seems to imply that you believe that
any time you assign something to "object", it's boxed. But that's not
true at all. Boxing _only_ happens when you assign a _value_ type to an
"object" variable. Because of the inheritance that is allowed with
reference types, and because "object" is the ultimate base class for every
reference type, any reference instance can be assigned to a variable of
type "object" without any conversion at all. It's already an "object".

For reference types, it's no different than if you assigned a "BoolType"
instance to a "DataType" variable (using the classes we started with
here). No conversion happens; the reference is simply assigned from the
"BoolType" variable to the "DataType" variable.

The "object" type is also the base class for value types, but only in a
strange, "kind of" sort of way. Value types don't have inheritance, so
it's sort of weird to say that a value type has a base class. But in C#,
they do. Even so, because value types aren't references, you can't assign
a value type to a reference variable typed as "object" without doing
something to bridge the gap.

Boxing is what does that bridging. It copies the value into a whole new
data structure, a structure that is itself a valid reference type. Only
_then_ can the reference, newly created, be assigned to a variable of type
"object".

Now, back to the example you tried to construct, for it to work you first
need to start with an actual value type, so that some boxing will take
place. If you do, you might wind up with something like this:

int A, B;

A = 5;
B = A;
A = 10;

Of course, for the example to work you'd like for changing A to also
change B. But that's not how value types work. So, even though the above
example hasn't even had the boxing introduced to it, you've already got a
situation where changing the original doesn't change the previously
assigned variable.

In that respect, other than the overhead of the boxing, boxing is actually
very much like how value types operate normally. Yes, boxing creates a
copy of the value type. But then, so does any assignment of a value type.
And likewise, since a boxed value type is immutable, even after you've
boxed a value type, there's no way to have multiple variables reference
the data in a way where you can change the data via one variable and see
the change via a different variable.

In other words, it's only going to be confusing if you think boxing might
do something above and beyond just creating a reference-able instance of a
value type.


The int type and string type are _definitely_ handled differently. An int
is a value type, and so to assign an int instance to an object variable,
it has to be boxed. But a string is a reference type, and assigning a
string instance to an object variable requires no such work.

But the string is a special type of reference type (immutable) in that you
can't just assign the reference to another string and assume they will stay
the same, since any change to the string will now change the object the
reference is pointing to. Correct?

Thanks,

Tom
 
Peter Duniho

I think I understand boxing a little better now.

After reading your information and a couple of other articles - I think I
have a better idea what it is and why it is used. The problem was the WHY.
I had seen other articles on the HOW but not much on the why:

int i = 0, j;
object o = i; // boxed
j = (int)o; // unboxed

This shows how it is done but not why you would use it.

Has the "why" been adequately explained yet? Or would you like
elaboration on that?

You're right in that I didn't understand the string type not being boxed and
the values being boxed and why they were different. But now if I understand
it correctly:

1) value types are stored on the stack (faster)

Value types are stored on the stack when they are local variables. But
as I think you've seen elsewhere in this thread, a value type can exist
inside a class and in that case the value type is stored in the heap
with the rest of the class instance.

But as far as the "faster" goes, yes...to some extent value types have
less overhead than reference types, and so can perform better in
certain cases.

2) reference types are stored on the heap and have a reference/pointer to
the object that is on the heap.

Right. That is, a reference type variable itself would still be stored
on the stack if it's a local variable, but only the reference is stored
on the stack. The instance itself is stored in the heap.

3) value types are copied to the heap and made into an object and reference
type (in essence) that now has a reference/pointer pointing to the new
object.

Right, that's boxing.

4) String types are already reference types and all we are doing when we do
"object o = str" is create a new reference/pointer that points to the object
on the heap.

Depends on your definition of "create". My view is that the assignment
doesn't create anything. The source and destination already exist, and
the reference is copied from one to the other.

But yes, in another sense a new copy of the reference is "created",
replacing whatever was in the variable before (and in that sense, if
the reference is "created", the previous value of the variable is
"destroyed").

In the case of a non-string object, the object is the same but we have 2
pointers to that same object. If the object changes value and both
references point to the same object, they will be the same.

Well, that would be true for a string object too, if there was any way
to actually change a string.

Likewise, if you assign a new reference to a reference type variable,
that doesn't affect other variables that might have referenced the same
object that variable previously referenced. For example:

object A = new object();
object B = A;

A = new object();

That's just like the string case. Assigning a new reference to A
doesn't change what B references. This works for _any_ reference type
variable.

The thing about the string class is that assigning a new string
instance to a string type variable is the _only_ way to change the
string value of that variable. Mutable classes, you can call some
method or set a property or something like that, and that will change
the instance itself, without requiring you to assign a new instance of
the class to a variable.
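
As a sketch of that contrast, using StringBuilder as a convenient mutable class (the demo class and method names are made up for illustration):

```csharp
using System;
using System.Text;

static class MutabilityDemo
{
    // Mutable class: a method call changes the instance that both variables reference.
    public static string MutateSharedBuilder()
    {
        StringBuilder a = new StringBuilder("something");
        StringBuilder b = a;
        a.Append(" else");
        return b.ToString();
    }

    // String: the only way to "change" a is to assign it a reference to a new string.
    public static string ReassignString()
    {
        string a = "something";
        string b = a;
        a = a + " else";   // builds a new string; b still references the original
        return b;
    }

    static void Main()
    {
        Console.WriteLine(MutateSharedBuilder()); // prints something else
        Console.WriteLine(ReassignString());      // prints something
    }
}
```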

I realize that the distinction might seem subtle, or even pedantic.
But I think it's a very important distinction. In particular, it's
very important to understand that with respect to the code we've seen
in this thread, the string class isn't really so different from other
classes. Because it's immutable, you _have_ to write code like we've
seen here, while other classes might provide other options. But even
with a mutable class, if you write code like we've written here, it'll
behave just like the string class (except of course for the one example
I showed specifically designed to change a class instance without
reassigning the variable).

But in the case of a string, which is immutable, the 1st reference would
point to the first string and the 2nd reference (after the string is
changed) will point to the new string. And that would be the case, even if
we didn't move it to a new object.

Even if we didn't move what to a new object?

As far as "the string is changed", I think it's important to
understand, the original string isn't changed. You create a whole new
string instance, and the string type variable itself is changed to
reference the new string instance.

Again, maybe it seems pedantic, but I think it's important to
understand the difference between the variable referencing some data
and the data itself, and along with that to make sure the language one
is using is clear about that difference.

string A = "something";
string B;
B = A;
A += " Else";

gives me the same result as:

string s1 = "something";
object o1 = s1;
s1 += " Else";

Where A <> B and o1 <> s1 because in both cases a new string is created even
though we think we are changing the strings.

Well, you might think you are changing the strings. :)

Seriously though, it is practically always the case that when you are
writing an assignment to a reference, you're replacing the reference
held by the variable. The obvious exception is of course where for
whatever reason your code sometimes assigns a reference to a variable
when it already holds that reference. Another example, one that I hope
is _extremely_ uncommon, would be in the case of an operator that's
overloaded to return the original instance reference for one of the
operands.

But as a general rule of thumb, changing an instance will go through
some member of the instance, while changing a reference value will
involve the assignment operator. The string class has no member of the
class that allows the instance to be changed, which is what makes it
immutable. But the "changing a reference value" is the same for the
string class as it is for any other class, immutable or not.

I also found a better article/tutorial which I think was trying to say the
same thing as the first one but this one does it better. He isn't
addressing the string reference type but addresses reference types in general.
http://www.jaggersoft.com/csharp_course/13_Boxing_files/frame.htm

I think he is saying the same thing you are saying (but he also doesn't go
into the string reference type).

Yes. I didn't care for the web design very much :), but it seems like
a better discussion of the concept.
[...]
_objCurrent isn't on the stack, which is all I was pointing out.

I assume that _objCurrent isn't on the stack because the object is not on
the stack and all reference types are stored on the heap. But is
_objCurrent just a pointer (reference) to the actual object (on the heap)?

Your assumption is correct. And yes, _objCurrent is a variable that
contains a reference to the actual object on the heap.
[...]
The boxing of the variable "i" does not in any way relate the variable "i"
to something on the heap. It _copies_ the value from the variable "i"
into a completely new data structure. Once this is done, "i" is
completely irrelevant to the reference instance. It's very misleading to
refer to that instance as being "i on the heap".

It may be misleading but I think the value that gets copied does get copied
to the heap. But in the case of a Class Object that is already on the
heap, you are copying the value from a heap location to a heap location.

I think you lost me. I'm not sure what "Class Object" refers to (maybe
something in the previously referenced web page?), but assuming it's a
class, it's a reference type and assigning the reference to the
instance to any variable just copies the reference, not the class
instance itself.

If that's what you meant, then I'll agree with that. :)
[...]
The current value of "i" is passed into the constructor, where it's copied
to the private field "_value". You can read it back out, which then
copies the value from the private field to wherever you assign it. But "i"
is only relevant when the instance is first constructed, and there only as
it's passed by value to the constructor.

So in this case I have "i", which is a value type, and boxed._value (if
_value were public), which are completely different even if they have the
same value at the start. If I then change "i" to 30, boxed._value will
still be 5.

Exactly.
[...]
In that case, your statement that "B will be 'something'" is not true. B
continues to reference the same instance as A, and having changed that
instance so that it contains "something else" instead, the instance of
Mutable that B refers to will contain "something else" (because it's the
same instance).

I got it.

If the string is inside the object and A and B both point at that object,
then even if the string changes INSIDE the object, A and B still point at
the same object, and therefore A.String and B.String are the same.

Right.

But in the above case:

string A = "something";
object B;
B = A;
A = "Something Else";

You really do have 2 references to 2 different objects (a string and an
object) and NOT 2 references to the same object.

Well, the _variable_ type is object for the second reference, but after
the third line ("B = A") you still only have one object: the original
string ("something").

Using the variable B you could only do things with the instance that
are defined in the class Object. But the instance is still a string.
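
A short sketch of those four lines, plus the cast back to string (the class and method names are invented for the example):

```csharp
using System;

static class ObjectVariableDemo
{
    // The variable is typed as object, but the instance it references is still a string.
    public static Type RuntimeTypeSeenThroughObject()
    {
        string a = "something";
        object b = a;
        return b.GetType();
    }

    // After a is reassigned, b still references the original string,
    // and a cast gives access to it as a string again.
    public static string CastBack()
    {
        string a = "something";
        object b = a;
        a = "Something Else";
        string original = (string)b;
        return original;
    }

    static void Main()
    {
        Console.WriteLine(RuntimeTypeSeenThroughObject()); // prints System.String
        Console.WriteLine(CastBack());                     // prints something
    }
}
```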

After the fourth line, you now have two objects, both of them strings.
The original string is now only accessible via a variable typed as
Object, so you can only use the things defined in the Object type. But
you could cast the reference held in B back to a string, and you'd then
be able to do all the things the class String defines with the instance.

And because of the way a
string works when you change A you actually get a 3rd object

No, only a second object. See above.

and the
original object is dereferenced by 1 and if nothing else is referencing it,
the GC will get rid of it.

For what it's worth: there's no ref-counting in the GC. The garbage
collector just scans the variables referencing data, and for any object
for which it doesn't find something referencing it, it collects that
object (to oversimplify the mechanism a bit :) ).

I'm not sure what you mean by "nothing else". To me, that means
"nothing other than B", and if so the statement's not correct. B
referencing the instance is sufficient to prevent it from being
collected. If B is the only variable referencing the instance however,
then once B is assigned something else, or simply goes out of scope,
then yes...the original instance can be collected.
[...]
Mutable A, C;
object B;

A = new Mutable("something");
B = A;
A.String = "something else";
C = (Mutable)B;
Console.WriteLine(C.String);

[...]
I think I have it. In the above example, I think A = B = C, since all 3 are
reference types and point to the same object (and none are strings).

That's right.
[...]
Maybe it is different with ints and strings.

The int type and string type are _definitely_ handled differently. An int
is a value type, and so to assign an int instance to an object variable,
it has to be boxed. But a string is a reference type, and assigning a
string instance to an object variable requires no such work.
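A small sketch of that difference, with illustrative names: assigning an int to an object variable copies the value into a new box, while assigning a string to an object variable only copies the reference.

```csharp
using System;

static class BoxingDemo
{
    public static int BoxedValueAfterChange()
    {
        int i = 5;
        object o = i;    // boxing: the value 5 is copied into a new object
        i = 30;          // changes only the variable i, not the box
        return (int)o;   // unboxing copies the value back out of the box
    }

    public static bool StringIsSameInstance()
    {
        string s = "something";
        object o = s;    // no boxing: only the reference is copied
        return Object.ReferenceEquals(s, o);
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange());
        Console.WriteLine(StringIsSameInstance());
    }
}
```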

But the string is a special type of reference type (immutable) in that you
can't just assign the reference to another string and assume they will stay
the same, since any change to the string will now change the object the
reference is pointing to. Correct?

Sort of. I think I explained this above, but just to reiterate:

Knowing that the String class is immutable, you know that if two or
more variables do reference the same instance, that that instance will
never change.

In that respect it's not correct to write "any change to the string",
since there's no such thing. A string is immutable; there can be no
change to the string.

As far as an assignment goes, if you assign a new instance to one of
the variables, none of the other variables will be affected by the
assignment. But that is true for any class, whether mutable or
immutable.

Having a class that's immutable gives you a certain guarantee for a
given instance, and thus for multiple variables all referencing that
instance. And it also places a specific requirement on how you can
change the data a particular variable describes: you can only replace
the reference with a new reference to a different instance, rather than
modifying the instance itself.

Basically: immutable places a requirement that an assignment be used
for specific kinds of behavior, but it doesn't actually affect what
happens during an assignment. Assignments have the same effect for
mutable classes as they do for immutable classes.
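Here is a sketch of that point, using System.Text.StringBuilder as an arbitrary example of a mutable class (it is not mentioned in the thread; any mutable class would do). Assignment behaves identically in the first two methods. Only the third, which goes through a member of the instance, changes what the other variable sees.

```csharp
using System;
using System.Text;

static class AssignmentDemo
{
    public static string ImmutableCase()
    {
        string a = "something";
        string b = a;
        a = "something else";                      // replaces the reference in a; b is unaffected
        return b;
    }

    public static string MutableCase()
    {
        StringBuilder a = new StringBuilder("something");
        StringBuilder b = a;
        a = new StringBuilder("something else");   // same kind of assignment, same effect
        return b.ToString();
    }

    public static string MutatedThroughMember()
    {
        StringBuilder a = new StringBuilder("something");
        StringBuilder b = a;
        a.Append(" else");                         // changes the shared instance itself
        return b.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ImmutableCase());
        Console.WriteLine(MutableCase());
        Console.WriteLine(MutatedThroughMember());
    }
}
```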

Pete
 
tshad

Peter Duniho said:
Has the "why" been adequately explained yet? Or would you like
elaboration on that?

I am always looking for different views on the issue as each one helps a
little more, as in this case. This exercise and various articles have
helped me to understand it.

I guess the question is not only WHY but WHEN. Obviously, it is the case
here.

I did try this class with my data in place of my old NullHandler and I think
this class is going to do what I need.

Here was what I found:

Instead of:

NullHandler.GetValueFromDbObject((object)user["UserID"],ref userID);

I found I can do:

StringType lastName = new StringType();
StringType email2 = new StringType();
lastName.Data = (object)dbReader["FirstName"];
email2.Data = (object)dbReader["Email2"];

At first I tried to cast them as string:
lastName.Data = (string)dbReader["FirstName"];
email2.Data = (string)dbReader["Email2"];

And that works fine, unless the value is null. In that case I get an error:

"Specified cast is not valid".

Not sure why that is since a string can be null.

string stemp;
stemp = null;

Works fine. So why would (string)dbReader["Email2"] not work?

Casting it as (object) works whether it is a null or not. But when it was
cast as an object I did get an error at the ValidateType function at the
line:

if (!typeRequired.IsInstanceOfType(obj))
{
throw new ArgumentException("assigned value type of " +
obj.GetType().Name + " is incompatible with required type of " +
typeRequired.Name);
}

The error is:

"assigned value type of DBNull is incompatible with required type of
String"

Once I saw that I realized I would need to change the line before call to
the ValidateType function to:

if ((value != null) && (value != DBNull.Value))
{
_ValidateType(value);
}
and that fixes that.

I also have to look at the TypedData properties, as they give me the
Invalid Cast error too - which, if you remember, is:

get { return (string)Data; }
Depends on your definition of "create". My view is that the assignment
doesn't create anything. The source and destination already exist, and
the reference is copied from one to the other.

But yes, in another sense a new copy of the reference is "created",
replacing whatever was in the variable before (and in that sense, if the
reference is "created", the previous value of the variable is
"destroyed").

What I meant was that the statement creates a new reference "o" that will
point to the same location as "str" which is where the string actually is.
Well, that would be true for a string object too, if there was any way to
actually change a string.
Right.

But what I was saying is that it isn't the same as the string since if you
change the string in any way "another" object is created and then one of the
references would be pointing at the original string and the other reference
would be pointing at the new string.
Likewise, if you assign a new reference to a reference type variable, that
doesn't affect other variables that might have referenced the same object
that variable previously referenced. For example:

object A = new object();
object B = A;

A = new object();

Right, but that is because YOU are creating a new object. In the string
case, the new object is being created because you changed the string. In a
non-string object, this would not be the case. If you said B.something =
15, both A and B would still point to the same object.
That's just like the string case. Assigning a new reference to A doesn't
change what B references. This works for _any_ reference type variable.

The thing about the string class is that assigning a new string instance
to a string type variable is the _only_ way to change the string value of
that variable. With mutable classes, you can call some method or set a
property or something like that, and that will change the instance itself,
without requiring you to assign a new instance of the class to a variable.

I realize that the distinction might seem subtle, or even pedantic. But I
think it's a very important distinction. In particular, it's very
important to understand that with respect to the code we've seen in this
thread, the string class isn't really so different from other classes.
Because it's immutable, you _have_ to write code like we've seen here,
while other classes might provide other options. But even with a mutable
class, if you write code like we've written here, it'll behave just like
the string class (except of course for the one example I showed
specifically designed to change a class instance without reassigning the
variable).

Right.

You are talking about a value type, I assume. Since you will have a value
type variable and a reference type variable.
Even if we didn't move what to a new object?

Not sure what I meant there either :(
As far as "the string is changed", I think it's important to understand,
the original string isn't changed. You create a whole new string
instance, and the string type variable itself is changed to reference the
new string instance.

Again, maybe it seems pedantic, but I think it's important to understand
the difference between the variable referencing some data and the data
itself, and along with that to make sure the language one is using is
clear about that difference.

I agree.
Well, you might think you are changing the strings. :)

Seriously though, it is practically always the case that when you are
writing an assignment to a reference, you're replacing the reference held
by the variable. The obvious exception is of course where for whatever
reason your code sometimes assigns a reference to a variable when it
already holds that reference. Another example, one that I hope is
_extremely_ uncommon, would be in the case of an operator that's
overloaded to return the original instance reference for one of the
operands.

But as a general rule of thumb, changing an instance will go through some
member of the instance, while changing a reference value will involve the
assignment operator. The string class has no member of the class that
allows the instance to be changed, which is what makes it immutable. But
the "changing a reference value" is the same for the string class as it is
for any other class, immutable or not.

I agree. A reference is a reference.

[...]
I think you lost me. I'm not sure what "Class Object" refers to (maybe
something in the previously referenced web page?), but assuming it's a
class, it's a reference type and assigning the reference to the instance
to any variable just copies the reference, not the class instance itself.

I was just making the point that a value type gets copied from the stack to
the heap. But if the value type is in an object and gets boxed there, it is
already on the heap (since it is part of the object) and it is actually
getting copied from the heap to another location on the heap. Not really
important just the fact that a value type is not always on the stack (if it
is part of an object). That wasn't made clear in either of the articles.
If that's what you meant, then I'll agree with that. :)
[...]
The current value of "i" is passed into the constructor, where it's
copied to the private field "_value". You can read it back out, which
then copies the value from the private field to wherever you assign it.
But "i" is only relevant when the instance is first constructed, and
there only as it's passed by value to the constructor.

So in this case I have "i" which is a value type and boxed.i (if i were
public) which are completely different even if they have the same value
at the start. If I then change "i" to 30, boxed.i will still = 5.
Exactly.
[...]
In that case, your statement that "B will be 'something'" is not true.
B continues to reference the same instance as A, and having changed that
instance so that it contains "something else" instead, the instance of
Mutable that B refers to will contain "something else" (because it's the
same instance).
I got it.

If the string is inside the object and A and B both pointed at the
object, then even if the string changed INSIDE the object, A and B are
both still pointing at the same object with the string changed, and
therefore A.String and B.String would be the same.
Right.

But in the above case:

string A = "something";
object B;
B = A;
A = "Something Else";

You really do have 2 references to 2 different objects (a string and an
object) and NOT 2 references to the same object.

Well, the _variable_ type is object for the second reference, but after
the third line ("B = A") you still only have one object: the original
string ("something").

Using the variable B you could only do things with the instance that are
defined in the class Object. But the instance is still a string.

After the fourth line, you now have two objects, both of them strings.
The original string is now only accessible via a variable typed as Object,
so you can only use the things defined in the Object type. But you could
cast the reference held in B back to a string, and you'd then be able to
do all the things the class String defines with the instance.
And because of the way a
string works when you change A you actually get a 3rd object

No, only a second object. See above.

You're right.

The first string won't go away because B is pointing at it, so we only have
2 objects.
[...]
Sort of. I think I explained this above, but just to reiterate:

Knowing that the String class is immutable, you know that if two or more
variables do reference the same instance, that that instance will never
change.

In that respect it's not correct to write "any change to the string",
since there's no such thing. A string is immutable; there can be no
change to the string.

I think it is just the language here. I agree that you can't change the
string. I was just saying that if you were trying to make a change to the
string (any change), it would just create another string/object with the result.
Even though it looks like the string changed, it actually didn't; you are
now pointing at a new location with the changed string.

Tom
 
Peter Duniho

[...]
StringType lastName = new StringType();
StringType email2 = new StringType();
lastName.Data = (object)dbReader["FirstName"];
email2.Data = (object)dbReader["Email2"];

At first I tried to cast them as string:
lastName.Data = (string)dbReader["FirstName"];
email2.Data = (string)dbReader["Email2"];

And that works fine, unless the value is null. In that case I get an error:

"Specified cast is not valid".

Not sure why that is since a string can be null.

Based on your later discussion of this issue, it appears to me that
when you write "the value is null", it's not really null.

In spite of its name, an instance of the DBNull type is not actually a
null as far as C# is concerned. It's an actual instance of an object,
and in this case one that's incompatible with the String type. You can
cast it to Object, because you can cast anything to Object. In fact,
in the code you wrote above you don't even need the cast. You can
assign anything to Object without casting.

The other thing you should be careful about is the type returned by
"dbReader[]" more generally. I don't know what type that is (I do very
little database work, so forgive me if it's something obvious :) ), but
presumably it's always returning a specific type, and probably that
type is Object. This means that when value types are returned, they
are already boxed.

This doesn't really affect how you'd use it -- even if you assigned an
unboxed value type to the Data property, it will be boxed for you in
the conversion to Object. But I think it's a good thing to be aware of
what's exactly going on.
string stemp;
stemp = null;

Works fine. So why would (string)dbReader["Email2"] not work?

Because the value being cast isn't "null", it's a reference to an
instance of "DBNull".
Casting it as (object) works whether it is a null or not.

That's because "DBNull" is an instance of type Object, but it's not an
instance of type String.
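The difference is easy to see in isolation. In this sketch (the helper name TryCastToString is invented for illustration), a real null reference casts to string without complaint, while DBNull.Value throws InvalidCastException, which is where the "Specified cast is not valid" message comes from.

```csharp
using System;

static class DbNullCastDemo
{
    // Attempts the same cast as (string)dbReader["Email2"] on an arbitrary object.
    public static bool TryCastToString(object value, out string result)
    {
        try
        {
            result = (string)value;   // fine for a string or a real null reference
            return true;
        }
        catch (InvalidCastException)
        {
            result = null;            // value was some other type, such as DBNull
            return false;
        }
    }

    static void Main()
    {
        string s;
        Console.WriteLine(TryCastToString(null, out s));          // real null: cast succeeds
        Console.WriteLine(TryCastToString(DBNull.Value, out s));  // DBNull instance: cast fails

        object fromDb = DBNull.Value;
        Console.WriteLine(fromDb == null);                        // DBNull.Value is not a null reference
        Console.WriteLine(fromDb is DBNull);
    }
}
```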
[...]
Once I saw that I realized I would need to change the line before call to
the ValidateType function to:

if ((value != null) && (value != DBNull.Value))
{
_ValidateType(value);
}
and that fixes that.

Well, it fixes the validation error. Whether it's really what you want
to do, I'm not sure. For one, unless you change the IsNull property,
you're not going to deal with value types correctly. Presumably, you'd
be checking IsNull before assigning an instance of DataType, but in
this case the data will be non-null (even if it's an instance of
DBNull). But casting DBNull in the case of a value type will throw an
exception, just as if you'd tried to cast an actual null reference
to a value type.

Personally, I would fix it by converting DBNull's to actual nulls:

if (value is DBNull)
{
value = null;
}
if (value != null)
...

(Note: I can't recall off the top of my head whether you can actually
assign to the implicit "value" parameter; seems like you should be able
to but if not, you can just copy it to a local variable in the setter
and use that local variable instead of "value")
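For what it's worth, C# does allow assigning to "value" inside a setter. A minimal sketch of the idea follows; the NullableData class is a hypothetical, stripped-down stand-in for the typed-data class in this thread, not the actual code.

```csharp
using System;

// Hypothetical, simplified stand-in for the thread's typed-data class.
class NullableData
{
    private object _data;

    public object Data
    {
        get { return _data; }
        set
        {
            // Convert the database's DBNull marker to a real null reference
            // before storing (type validation would go after this point).
            if (value is DBNull)
            {
                value = null;
            }
            _data = value;
        }
    }

    public bool IsNull
    {
        get { return _data == null; }
    }
}

static class NullableDataDemo
{
    static void Main()
    {
        NullableData d = new NullableData();
        d.Data = DBNull.Value;
        Console.WriteLine(d.IsNull);
        d.Data = "someone@example.com";
        Console.WriteLine(d.IsNull);
    }
}
```

With the conversion done in the setter, IsNull and any later casts only ever have to deal with real nulls.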
I also have to look at the TypedData properties, as they give me the
Invalid Cast error too - which, if you remember, is:

get { return (string)Data; }

Right. Same problem...you don't actually have a null value, you've got
a reference to a DBNull instance.
What I meant was that the statement creates a new reference "o" that will
point to the same location as "str" which is where the string actually is.

Well, I would say that the statement only "creates a new reference" as
far as the compiler is concerned. That is, the reference variable
exists in the compiled code, as a local variable. The variable itself
is created as soon as you call the method containing it. Executing the
assignment simply copies a reference from one variable to another. It
doesn't really "create" anything.
Right.

But what I was saying is that it isn't the same as the string since if you
change the string in any way "another" object is created and then one of the
references would be pointing at the original string and the other reference
would be pointing at the new string.

But writing "if you change the string" isn't a precise way of
describing it. You can't change a string.

Again, I realize this is a subtle point, but IMHO it's important. You
may be conceptualizing the act of assigning a new string instance to
your variable as "changing the string", but that's not what you're
doing. You can't change a string, you can only replace a reference to
one string with a reference to another. The original string remains
unchanged.
Right, but that is because YOU are creating a new object.

You are creating a new object in the string example as well. Granted,
the object is created as part of the application's pool of string
constants, rather than being instantiated with the new operator. But
you are still creating a new object in that case too.
In the string
case, the new object is being created because you changed the string.

You didn't change the string. The original string remains unchanged.
You've simply replaced a reference to the original string with a
reference to a different string.
In a
non-string object, this would not be the case. If you said B.something =
15, both A and B would still point to the same object.

But note that in that example you're not assigning to B. You're
assigning to B.something. If there was an assignable "something"
property on the String class, you could do the same with a String.
Conversely, if you change "B.something" to "B", then assuming the
assignment is legal at all (implicit conversion, casting, etc.) you
would be replacing the _reference_ stored by the variable B rather than
changing the original instance B referenced, just like in the string
case.
Right.

You are talking about a value type, I assume. Since you will have a value
type variable and a reference type variable.

No, the above discussion is entirely with respect to reference types.
[...]
I think you lost me. I'm not sure what "Class Object" refers to (maybe
something in the previously referenced web page?), but assuming it's a
class, it's a reference type and assigning the reference to the instance
to any variable just copies the reference, not the class instance itself.

I was just making the point that a value type gets copied from the stack to
the heap. But if the value type is in an object and gets boxed there, it is
already on the heap (since it is part of the object) and it is actually
getting copied from the heap to another location on the heap.

Well, a couple of points here:

First, if the value type is stored in an object, that doesn't mean it's
boxed. It's not. A value type is only boxed when it's _assigned_ to
an object reference variable. If it's _in_ an object, it's just a
plain value type.

So in that case, yes...the value when boxed is copied from one place in
the heap to another. Just as if it's copied from one class instance to
another would cause it to be copied from one place in the heap to
another.

Looking at your statement another way, if you really do want to talk
about copying an already-boxed value type to some other object
reference variable, then: if a value type is already boxed, then when
assigning that boxed instance to another object variable, no copying is
done. Once it's a reference type, the reference can be assigned at
will, with full reference type semantics. That's one of the advantages
of boxing; it really does create a reference type instance that behaves
just like any other reference type instance.

Boxing the value type is expensive, but once it's done, you get all of
the advantages of any other reference type, including being able to
assign references to other variables without having to copy all of the
data.
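A sketch of those three cases, with made-up names (Holder is a hypothetical class that simply has an int field): copying a boxed reference, boxing the same value twice, and boxing a value that already lives inside a heap object.

```csharp
using System;

// Hypothetical class: the int field is stored inside the instance, unboxed.
class Holder
{
    public int Number;
}

static class BoxReferenceDemo
{
    public static bool SharedBox()
    {
        int i = 5;
        object first = i;          // boxing: copies the value into a new object
        object second = first;     // reference copy only; no new box is made
        return Object.ReferenceEquals(first, second);
    }

    public static bool SeparateBoxes()
    {
        int i = 5;
        object first = i;
        object second = i;         // boxed a second time: a different object
        return Object.ReferenceEquals(first, second);
    }

    public static int BoxFromField()
    {
        Holder h = new Holder();
        h.Number = 5;
        object boxed = h.Number;   // value copied out of the Holder instance into a new box
        h.Number = 30;             // the box is not affected
        return (int)boxed;
    }

    static void Main()
    {
        Console.WriteLine(SharedBox());
        Console.WriteLine(SeparateBoxes());
        Console.WriteLine(BoxFromField());
    }
}
```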
Not really
important just the fact that a value type is not always on the stack (if it
is part of an object). That wasn't made clear in either of the articles.

I think the second article (the slides) almost got there, but yes...I'd
agree neither was particularly clear on the matter.
[...]
But the string is a special type of reference type (immutable) in that
you can't just assign the reference to another string and assume they
will stay the same, since any change to the string will now change the
object the reference is pointing to. Correct?

Sort of. I think I explained this above, but just to reiterate:

Knowing that the String class is immutable, you know that if two or more
variables do reference the same instance, that that instance will never
change.

In that respect it's not correct to write "any change to the string",
since there's no such thing. A string is immutable; there can be no
change to the string.

I think it is just the language here. I agree that you can't change the
string. I was just saying that if you were trying to make a change to the
string (any change), it would just create another string/object with the result.
Even though it looks like the string changed, it actually didn't; you are
now pointing at a new location with the changed string.

I guess my point is that long-term I think you'll have more success if
you can think more literally about what's going on when you assign a
new string reference to a string variable.

Don't think of it as "trying to make a change". If you were trying to
make a change, you'd call some method, or assign some property, in the
String class that was intended to change the instance itself. Of
course, no such method or property exists on the String class; that's
what makes it immutable. But you never "make a change" by assigning a
reference type. The only thing that can happen when assigning a
reference type is to change the reference itself.

So when you assign a string instance to a string variable, you aren't
"trying to make a change" and it could lead to problems conceptualizing
things later if you continue to think of the assignment as somehow
different from assignments of any other reference type. Assignments of
reference type variables never change the instance; you could never
"make a change" to a reference type instance with a simple assignment
to the reference type variable itself, and it's misleading to think of
doing so with a String type variable in that way.

The important thing here is this: it's practically impossible for a
human being to not be affected by the language that they use to
describe things. You may tell yourself "well, I'm saying it this way,
but I really mean something else". But as much as you may get away
with it some of the time, for most people it's impossible to get away
with it all of the time. Eventually, the way you describe the
operation is going to color your perception of what's actually going
on, and if you describe the operation imprecisely, your perception of
what's going on will be flawed and possibly lead to problems.

What kind of problems? Well, they would possibly include just having
difficulty getting a design implemented, or they could include some
actual bug. Of course, bugs caused by fundamental perception problems
are among the hardest to solve, because the code _looks_ fine to you.

Anyway, that's a very long way of saying I don't think you should ever
think about "changing a string" in .NET. Strings are immutable and
can't be changed. You also shouldn't think that assigning a reference
to a string instance to a String type variable is a way to "try to
change a string". It's not; it's a way to change the _reference_ to a
string. It would never change the string itself, whether the class was
mutable or not.

Any code you could write that _is_ actually "trying to change a string"
just won't compile. Thinking that assigning a new string instance is
an example of "trying to change a string" could lead to perception
problems with respect to the use of the String class, or it could in
fact lead to perception problems with respect to other reference types.
Either way, it can lead to problems. :)
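As a small illustration (names invented for the example): the one line that really would try to change the string is commented out because it doesn't compile, and the Replace call hands back a new instance rather than touching the original.

```csharp
using System;

static class StringChangeDemo
{
    public static string Run()
    {
        string original = "something";
        string alias = original;      // second variable referencing the same instance

        // original[0] = 'S';         // does not compile: the string indexer is read-only

        // Replace returns a new string instance. The assignment then changes
        // the reference held by the variable, not the string it used to refer to.
        original = original.Replace("some", "any");

        return alias + "/" + original;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```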

Pete
 
tshad

Peter Duniho said:
[...]
StringType lastName = new StringType();
StringType email2 = new StringType();
lastName.Data = (object)dbReader["FirstName"];
email2.Data = (object)dbReader["Email2"];

At first I tried to cast them as string:
lastName.Data = (string)dbReader["FirstName"];
email2.Data = (string)dbReader["Email2"];

And that works fine, unless the value is null. In that case I get an error:

"Specified cast is not valid".

Not sure why that is since a string can be null.

Based on your later discussion of this issue, it appears to me that when
you write "the value is null", it's not really null.

In spite of its name, an instance of the DBNull type is not actually a
null as far as C# is concerned. It's an actual instance of an object, and
in this case one that's incompatible with the String type. You can cast
it to Object, because you can cast anything to Object. In fact, in the
code you wrote above you don't even need the cast. You can assign
anything to Object without casting.

You're right. I don't need the (object) casting. And DBNull is really
System.DBNull which is different than null.
The other thing you should be careful about is the type returned by
"dbReader[]" more generally. I don't know what type that is (I do very
little database work, so forgive me if it's something obvious :) ), but
presumably it's always returning a specific type, and probably that type
is Object. This means that when value types are returned, they are
already boxed.

That is probably true. If you normally do:

string stemp;
stemp = dbReader["Name"];

You will get a compiler error that says:

Cannot implicitly convert type 'object' to 'string'.

You have to:

stemp = (string)dbReader["Name"];
[...]
Once I saw that I realized I would need to change the line before call to
the ValidateType function to:

if ((value != null) && (value != DBNull.Value))
{
_ValidateType(value);
}
and that fixes that.

Well, it fixes the validation error. Whether it's really what you want to
do, I'm not sure. For one, unless you change the IsNull property, you're
not going to deal with value types correctly. Presumably, you'd be
checking IsNull before assigning an instance of DataType, but in this case
the data will be non-null (even if it's an instance of DBNull). But
casting DBNull in the case of a value type will throw an exception, just
as if you'd tried to cast an actual null reference to a value type.

Personally, I would fix it by converting DBNull's to actual nulls:

if (value is DBNull)
{
value = null;
}
if (value != null)
...
I agree. Then if the object is null, I can pass back DBNull.Value, since
this is mainly going to be used in a database scenario and DBNull.Value is
what is required for sending a null.
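A sketch of that round trip, assuming a pair of small helper methods (DbValueConverter, FromDb and ToDb are hypothetical names, not part of any library): DBNull becomes null on the way in, and null becomes DBNull.Value on the way back out.

```csharp
using System;

static class DbValueConverter
{
    // Value read from the database: treat DBNull as a real null.
    public static object FromDb(object value)
    {
        return value is DBNull ? null : value;
    }

    // Value about to be sent to the database: turn null back into DBNull.Value.
    public static object ToDb(object value)
    {
        return value ?? DBNull.Value;
    }

    static void Main()
    {
        Console.WriteLine(FromDb(DBNull.Value) == null);
        Console.WriteLine(ToDb(null) is DBNull);
        Console.WriteLine(ToDb("Smith"));
    }
}
```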

[...]
Well, I would say that the statement only "creates a new reference" as far
as the compiler is concerned. That is, the reference variable exists in
the compiled code, as a local variable. The variable itself is created as
soon as you call the method containing it. Executing the assignment
simply copies a reference from one variable to another. It doesn't really
"create" anything.

Actually, with the following:

object o = str;

Doesn't that just create another reference/pointer that points to the same
string object? So that we would now have 2 pointers (o and str) and 1
object that they both point to?

If I were to only do an "object o;", then I assume a 32-bit portion of memory
is set aside that points at nothing until you use it.
But writing "if you change the string" isn't a precise way of describing
it. You can't change a string.

Again, I realize this is a subtle point, but IMHO it's important. You may
be conceptualizing the act of assigning a new string instance to your
variable as "changing the string", but that's not what you're doing. You
can't change a string, you can only replace a reference to one string with
a reference to another. The original string remains unchanged.

I agree.

[...]
I guess my point is that long-term I think you'll have more success if you
can think more literally about what's going on when you assign a new
string reference to a string variable.

Don't think of it as "trying to make a change". If you were trying to
make a change, you'd call some method, or assign some property, in the
String class that was intended to change the instance itself. Of course,
no such method or property exists on the String class; that's what makes
it immutable. But you never "make a change" by assigning a reference
type. The only thing that can happen when assigning a reference type is
to change the reference itself.

So when you assign a string instance to a string variable, you aren't
"trying to make a change" and it could lead to problems conceptualizing
things later if you continue to think of the assignment as somehow
different from assignments of any other reference type. Assignments of
reference type variables never change the instance; you could never "make
a change" to a reference type instance with a simple assignment to the
reference type variable itself, and it's misleading to think of doing so
with a String type variable in that way.

The important thing here is this: it's practically impossible for a human
being to not be affected by the language that they use to describe things.
You may tell yourself "well, I'm saying it this way, but I really mean
something else". But as much as you may get away with it some of the
time, for most people it's impossible to get away with it all of the time.
Eventually, the way you describe the operation is going to color your
perception of what's actually going on, and if you describe the operation
imprecisely, your perception of what's going on will be flawed and
possibly lead to problems.

Probably true.

Thanks,

tom
 
Peter Duniho

[...]
Actually, with the following:

object o = str;

Doesn't that just create another reference/pointer that points to the same
string object?

Well, again, that comes down to your definition of "create". It's not
how I'd use the word, but I accept an alternative interpretation in
which you can use "create" to describe what's going on.
So that we would now have 2 pointers (o and str) and 1
object that they both point to?

That is true, and probably the most important part.
If I were to only do an "object o;", then I assume a 32-bit portion of memory
is set aside that points at nothing until you use it.

Yes. And to me, that's not "creating" a pointer, nor is the assignment
"creating" a pointer. The pointer is created when you create the
object. The pointer is copied when that pointer is assigned to some
variable.

Declaring the variable "creates" a variable that can hold a reference.
Assigning the variable copies a reference to the location that was
created. But the reference itself was created already, and adding new
variables that reference the same instance doesn't create new
references, it just copies the existing one.

At least, that's how _I_ think about it. As I said, I don't find this
is something that _must_ be thought of that way. As long as you're
clear on the mechanics, I think this is a case where you can use
whatever terminology feels most comfortable to you.

Pete
 
