How do most programmers protect their internal data structures?

James A. Fortune

The next question I have about the "Effective C#" books is with regard
to Public variables and Class exposure. It was unsettling enough for
me to learn that Refection could be used to change read-only
variables.

In book 1, pp. 137-138:

[Item 23: Avoid Returning References to Internal Class Objects]

You'd like to think that a read-only property is read-only and that
callers can't modify it. Unfortunately, that's not the way it works.
If you create a property that creates a reference type, the caller can
access any public member of the object.
....
You created properties to hide your internal data structures. You
provided methods to let clients manipulate the data only through known
methods, so your class can manage any changes to internal state. And
then a read-only property opens a gaping hole in your class
encapsulation.
....
You've got four different strategies for protecting your internal data
structures from unintended modifications: value types, immutable
types, interfaces, and wrappers.

Elsewhere in the book, he seems to promote using immutable types
wherever possible, but what do most programmers do to protect their
internal data structure? Note: I've never written a C# API
personally.

James A. Fortune
(e-mail address removed)

All data members should be private, without exception. -- Bill Wagner
 
Arne Vajhøj

The next question I have about the "Effective C#" books is with regard
to Public variables and Class exposure. It was unsettling enough for
me to learn that Refection could be used to change read-only
variables.

In book 1, pp. 137-138:

[Item 23: Avoid Returning References to Internal Class Objects]

You'd like to think that a read-only property is read-only and that
callers can't modify it. Unfortunately, that's not the way it works.
If you create a property that creates a reference type, the caller can
access any public member of the object.
...
You created properties to hide your internal data structures. You
provided methods to let clients manipulate the data only through known
methods, so your class can manage any changes to internal state. And
then a read-only property opens a gaping hole in your class
encapsulation.
...
You've got four different strategies for protecting your internal data
structures from unintended modifications: value types, immutable
types, interfaces, and wrappers.

Elsewhere in the book, he seems to promote using immutable types
wherever possible, but what do most programmers do to protect their
internal data structure?

A mix of:

get simple types (int, double etc. are value types, string is immutable)
get custom classes that are either immutable or only mutable via real
methods
get readonly collections of such custom classes (readonly is usually done
via a wrapper)
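
A minimal sketch of the wrapper approach from the last item, assuming a hypothetical `Roster` class (none of these names come from the book or the thread). The internal list stays private, callers get a `ReadOnlyCollection<string>` view, and changes go through a real method.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical class, used only for illustration.
class Roster
{
    private readonly List<string> _names = new List<string>();
    private readonly ReadOnlyCollection<string> _view;

    public Roster()
    {
        // AsReadOnly returns a live read-only wrapper over the same list.
        _view = _names.AsReadOnly();
    }

    // Callers can enumerate and index, but the wrapper exposes no public Add or Remove.
    public ReadOnlyCollection<string> Names { get { return _view; } }

    // All changes go through a real method, so the class stays in control.
    public void Add(string name)
    {
        if (string.IsNullOrEmpty(name)) throw new ArgumentException("name");
        _names.Add(name);
    }
}

static class RosterDemo
{
    static void Main()
    {
        Roster roster = new Roster();
        roster.Add("Ada");
        Console.WriteLine(roster.Names.Count);
        // Casting to IList<string> and calling Add would throw NotSupportedException.
    }
}
```

The wrapper only protects the collection itself; the elements still need to be immutable or mutable only via real methods, as in the second item.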

Arne
 
James A. Fortune

The next question I have about the "Effective C#" books is with regard
to Public variables and Class exposure.  It was unsettling enough for
me to learn that Refection could be used to change read-only
variables.

Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).  Read-only _properties_ on the other hand, those you can
modify via reflection, but only if they are read-only by virtue of a
non-public setter.  If no setter exists at all, it could be fairly
tricky to write reliable code that would in fact modify the property value.
In book 1, pp. 137-138:
[Item 23: Avoid Returning References to Internal Class Objects]
You'd like to think that a read-only property is read-only and that
callers can't modify it.  Unfortunately, that's not the way it works.
If you create a property that creates a reference type, the caller can
access any public member of the object.

IMHO that's a misleading paragraph.  No one has ever even made a claim
that a read-only property is equivalent to the value being returned by
the property itself being immutable.

Furthermore, it overstates the problem, because simply being able to
access public members of an object doesn't even imply that one can
mutate the object.  If the reference type is immutable, then it can't be
changed, by definition.

It's important to understand the difference between variables and the
reference type objects to which they refer.  They are most definitely
not the same, and someone assuming they are is going to run into all
sorts of problems, the read-only aspect being the least of those issues.
...
You created properties to hide your internal data structures.  You
provided methods to let clients manipulate the data only through known
methods, so your class can manage any changes to internal state.  And
then a read-only property opens a gaping hole in your class
encapsulation.

Baloney.  That author is just spreading FUD.

In fact, it's not uncommon at all for a read-only property to
_intentionally_ return a mutable object for the express purpose of
allowing the caller to manipulate a certain kind of state through that
object.

A class that wants to return a mutable object from a read-only property
but without allowing the caller to change that class's state through
that object can (and should) simply return a copy of the relevant
object.  In most cases, this is actually not necessary, but when it is
it's no big deal to do correctly.
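
A minimal sketch of the copy approach, assuming a hypothetical `Histogram` class. The property hands out a snapshot, so whatever the caller does to it never reaches the internal list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, used only for illustration.
class Histogram
{
    private readonly List<int> _samples = new List<int>();

    public void Record(int sample) { _samples.Add(sample); }

    public int Count { get { return _samples.Count; } }

    // Returns a snapshot: the caller may mutate it freely
    // without touching the internal list.
    public List<int> Samples
    {
        get { return new List<int>(_samples); }
    }
}

static class HistogramDemo
{
    static void Main()
    {
        Histogram h = new Histogram();
        h.Record(3);
        List<int> copy = h.Samples;
        copy.Clear();                 // affects only the copy
        Console.WriteLine(h.Count);   // the internal list still holds its sample
    }
}
```

The cost is one allocation per read, which is part of why copying is only worth it when the caller really must not reach the internal state.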

There's no "gaping hole".  It's simply a matter of understanding how to
best design your class.
...
You've got four different strategies for protecting your internal data
structures from unintended modifications: value types, immutable
types, interfaces, and wrappers.
Elsewhere in the book, he seems to promote using immutable types
wherever possible, but what do most programmers do to protect their
internal data structure?  Note: I've never written a C# API
personally.

Programmers just do the usual things to protect their internal data
structure.  I.e. they encapsulate functionality and hide information as
necessary.  There are a number of useful strategies, all of which boil down
to not offering access to data or behaviors that should be private to
the class.

Immutable types are certainly a potential part of this strategy.  Value
types should, ideally, always be immutable anyway and of course even
when they are not, the original data can only be modified by
re-assignment.  Immutable reference types give value type-like semantics
without the overhead of having to copy all the data all the time.  So
certainly those are two strategies one can apply when desiring to
restrict a caller's ability to modify an object.
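
A minimal sketch of an immutable reference type, assuming hypothetical `Money` and `Account` classes. Returning `Money` from a read-only property is safe without copying, because there is no member that changes an existing instance.

```csharp
using System;

// Hypothetical immutable type, used only for illustration.
sealed class Money
{
    private readonly decimal _amount;
    private readonly string _currency;

    public Money(decimal amount, string currency)
    {
        _amount = amount;
        _currency = currency;
    }

    public decimal Amount { get { return _amount; } }
    public string Currency { get { return _currency; } }

    // "Mutation" produces a new instance; the original is untouched.
    public Money Add(decimal delta)
    {
        return new Money(_amount + delta, _currency);
    }
}

// Hypothetical owner of the state.
class Account
{
    private Money _balance = new Money(0m, "USD");

    // Safe to hand out: callers cannot change the object they receive.
    public Money Balance { get { return _balance; } }

    public void Deposit(decimal amount) { _balance = _balance.Add(amount); }
}

static class MoneyDemo
{
    static void Main()
    {
        Account account = new Account();
        account.Deposit(10m);
        Money other = account.Balance.Add(5m);   // new object, account unchanged
        Console.WriteLine("{0} {1}", account.Balance.Amount, other.Amount);
    }
}
```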

But there's no magic bullet.  OOP programming languages provide features
such as member accessibility and const/read-only modifiers, and it's up
to the programmer to use those features to achieve the goals of proper
abstraction, encapsulation, and information hiding.

Pete

I'll take your advice and concentrate on general C# programming
constructs before worrying how others are going to use my code. The
"Effective C#" books brought up a number of interesting topics to
think about. Thanks for providing balance to his security
recommendations. Thanks to Arne also.

Future Directions:
I have a desire to go back and use C# to take advantage of the .NET
Framework from within Access. Therefore, I have a special interest in
features such as Nullable Collections, Optional Parameters, Late
Binding, and Parallel Threads. I don't want to limit myself only to
learning C# techniques that will gentrify Access since C# seems so
handy in its own right :). I also need to be ready to streamline SAP
where I work and Access isn't as good as C# at hiding SQL. WPF will
provide a nice way to choose and display resulting "what-if" scenarios
graphically. I am convinced that C# will be a central focus of
Microsoft based software technologies for many years. I will keep
pounding at it hard. C# terminology is becoming more automatic now,
mostly due to reading this NG and following up on links. It's time
for me to roll up my sleeves and do some coding.

James A. Fortune
(e-mail address removed)
 
Arne Vajhøj

I have a desire to go back and use C# to take advantage of the .NET
Framework from within Access. Therefore, I have a special interest in
features such as Nullable Collections, Optional Parameters, Late
Binding, and Parallel Threads. I don't want to limit myself only to
learning C# techniques that will gentrify Access since C# seems so
handy in its own right :). I also need to be ready to streamline SAP
where I work and Access isn't as good as C# at hiding SQL.

It is not the language in itself. But some of the .NET technologies,
like EF (and NHibernate, if we count non-MS stuff for .NET as well),
are not just 10-15 years newer than ADO by the calendar but also
at a higher abstraction level.

As for SAP, it is more Java-oriented than .NET-oriented.
WPF will
provide a nice way to choose and display resulting "what-if" scenarios
graphically. I am convinced that C# will be a central focus of
Microsoft based software technologies for many years.

That is MS's direction.

And even if they decided to change direction (and that has a
probability that is very close to zero) then it would
take 15-25 years before it was completed.
I will keep
pounding at it hard. C# terminology is becoming more automatic now,
mostly due to reading this NG and following up on links. It's time
for me to roll up my sleeves and do some coding.

That is the way to go.

Arne
 
Arne Vajhøj

Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).

It does not.

You can change a readonly field.
Read-only _properties_ on the other hand, those you can
modify via reflection, but only if they are read-only by virtue of a
non-public setter. If no setter exists at all, it could be fairly tricky
to write reliable code that would in fact modify the property value.

There is not even guaranteed to be a value to change.

DateTime.Now is a bit difficult to change.
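
A minimal sketch of the distinction being made here, assuming a hypothetical `Gauge` class. `Level` is read-only only because its setter is private, so reflection can find and invoke that setter; `ReadAt` is computed on each call, like `DateTime.Now`, and has no setter and no stored value to overwrite.

```csharp
using System;
using System.Reflection;

// Hypothetical class, used only for illustration.
class Gauge
{
    // Read-only to outside callers by virtue of a non-public setter.
    public int Level { get; private set; }

    // No setter and no backing state: the value is computed on each read.
    public DateTime ReadAt { get { return DateTime.Now; } }
}

static class PropertyProbe
{
    public static bool HasSetter(string propertyName)
    {
        PropertyInfo pi = typeof(Gauge).GetProperty(propertyName);
        // Passing true asks for non-public accessors as well.
        return pi.GetSetMethod(true) != null;
    }

    public static void ForceLevel(Gauge gauge, int level)
    {
        PropertyInfo pi = typeof(Gauge).GetProperty("Level");
        // Invokes the private setter through reflection.
        pi.SetValue(gauge, level, null);
    }

    static void Main()
    {
        Gauge g = new Gauge();
        ForceLevel(g, 42);
        Console.WriteLine("Level: {0}", g.Level);
        Console.WriteLine("Level has a setter: {0}", HasSetter("Level"));
        Console.WriteLine("ReadAt has a setter: {0}", HasSetter("ReadAt"));
    }
}
```

This assumes the code runs with enough trust to invoke non-public members; under restricted permissions the call may be refused.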
In book 1, pp. 137-138:

[Item 23: Avoid Returning References to Internal Class Objects]

You'd like to think that a read-only property is read-only and that
callers can't modify it. Unfortunately, that's not the way it works.
If you create a property that creates a reference type, the caller can
access any public member of the object.

IMHO that's a misleading paragraph. No one has ever even made a claim
that a read-only property is equivalent to the value being returned by
the property itself being immutable.

No one that knows C# (or for that matter Java or C++).

But I would expect many beginners learning programming to be tricked.
Baloney. That author is just spreading FUD.

In fact, it's not uncommon at all for a read-only property to
_intentionally_ return a mutable object for the express purpose of
allowing the caller to manipulate a certain kind of state through that
object.

A class that wants to return a mutable object from a read-only property
but without allowing the caller to change that class's state through
that object can (and should) simply return a copy of the relevant
object. In most cases, this is actually not necessary, but when it is
it's no big deal to do correctly.

There's no "gaping hole". It's simply a matter of understanding how to
best design your class.

I agree that it is often intended.

But it also happens frequently that it is not intended.

Arne
 
James A. Fortune

[...]
Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).
It does not.
You can change a readonly field.

Sorry, I was not clear.  I agree that there are scenarios in which even
a readonly field can be changed.  But it's important to note that these
are a bit more "hacky" than just using reflection.

My point is that you can't simply get a FieldInfo and go writing over
the field.  It's not true that "Refection [sic] could be used to change
read-only variables."

LOL. Refection is a real word meaning nourishment. That's why the
spell checker didn't catch it. While I'm here, I agree that there's
likely to be a much greater effort required to change a read-only
property if it doesn't have a setter versus modifying a read-only
attribute. It's useful to know that such gymnastics are possible, but
it's nothing I need to worry about for now. This thread happened to
cover attribute usage anyway.

Still, I gleaned a lot of useful information from this thread. I
looked at EF a few months ago and the potential for eliminating joins
certainly caught my attention. I'm still a relative beginner at C#,
but I've done a lot of programming in my life and pick up new ideas
quickly. A degree in applied mathematics and two engineering degrees
help keep the number of conceptual surprises to a minimum. Plus, I
push myself harder than any boss can. I.e., don't expect me to be a
relative beginner for very long. I'm impressed by the level of C#
expertise in this NG and respect the amount of effort it took to
obtain it; but I have no doubt about the amount of resolve I have and
the results I expect to obtain based on that resolve. Plus, once I
obtain the results I want after much hard work, I won't forget the
help I received here, either from questions I posted or from simply
lurking. In the Access newsgroups, I feel like I gave back as much as
I got. Perhaps that will happen here eventually. I have my own
reasons for preferring to use the newsgroups.

James A. Fortune
(e-mail address removed)
 
Registered User

[...]
Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).

It does not.

You can change a readonly field.

Sorry, I was not clear. I agree that there are scenarios in which even
a readonly field can be changed. But it's important to note that these
are a bit more "hacky" than just using reflection.

My point is that you can't simply get a FieldInfo and go writing over
the field. It's not true that "Refection [sic] could be used to change
read-only variables."

For example, in the code below, only "_field2" is modified:
Are you certain of this? The code I copied and pasted changes both
values.

regards
A.G.
using System;
using System.Reflection;

namespace TestWriteReadOnlyField
{
    class Program
    {
        static readonly int _field1 = 17;
        static int _field2 = 19;

        static void Main(string[] args)
        {
            SetFieldValue("_field1", 31);
            SetFieldValue("_field2", 37);

            Console.WriteLine("_field1: {0}", _field1);
            Console.WriteLine("_field2: {0}", _field2);
        }

        static void SetFieldValue(string name, object value)
        {
            FieldInfo fi = typeof(Program).GetField(name,
                BindingFlags.Static | BindingFlags.NonPublic);

            fi.SetValue(null, value);
        }
    }
}
 
Arne Vajhøj

[...]
Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).

It does not.

You can change a readonly field.

Sorry, I was not clear. I agree that there are scenarios in which even a
readonly field can be changed. But it's important to note that these are
a bit more "hacky" than just using reflection.

My point is that you can't simply get a FieldInfo and go writing over
the field. It's not true that "Refection [sic] could be used to change
read-only variables."

For example, in the code below, only "_field2" is modified:

using System;
using System.Reflection;

namespace TestWriteReadOnlyField
{
    class Program
    {
        static readonly int _field1 = 17;
        static int _field2 = 19;

        static void Main(string[] args)
        {
            SetFieldValue("_field1", 31);
            SetFieldValue("_field2", 37);

            Console.WriteLine("_field1: {0}", _field1);
            Console.WriteLine("_field2: {0}", _field2);
        }

        static void SetFieldValue(string name, object value)
        {
            FieldInfo fi = typeof(Program).GetField(name,
                BindingFlags.Static | BindingFlags.NonPublic);

            fi.SetValue(null, value);
        }
    }
}

Try making the fields non-static!

Arne
 
Arne Vajhøj

[...]
Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).

It does not.

You can change a readonly field.

Sorry, I was not clear. I agree that there are scenarios in which even
a readonly field can be changed. But it's important to note that these
are a bit more "hacky" than just using reflection.

My point is that you can't simply get a FieldInfo and go writing over
the field. It's not true that "Refection [sic] could be used to change
read-only variables."

For example, in the code below, only "_field2" is modified:
Are you certain of this? The code I copied and pasted changes both
values.

Not here.

It silently ignores the attempt to modify _field1.

But make it non static and it does get changed.
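
A minimal sketch of the non-static variant, assuming hypothetical `Holder` and `InstanceReadonlyProbe` names. It does the same thing as the test program, but against an instance readonly field. That the write takes effect is observed behavior of particular runtimes, not something this sketch can guarantee everywhere.

```csharp
using System;
using System.Reflection;

// Hypothetical class, used only for illustration.
class Holder
{
    private readonly int _value = 17;
    public int Value { get { return _value; } }
}

static class InstanceReadonlyProbe
{
    // Attempts to overwrite the readonly instance field and
    // returns what ordinary code reads afterwards.
    public static int Overwrite(Holder holder, int newValue)
    {
        FieldInfo fi = typeof(Holder).GetField(
            "_value", BindingFlags.Instance | BindingFlags.NonPublic);
        fi.SetValue(holder, newValue);
        return holder.Value;
    }

    static void Main()
    {
        Holder h = new Holder();
        Console.WriteLine("before: {0}", h.Value);
        Console.WriteLine("after:  {0}", Overwrite(h, 31));
    }
}
```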

Arne
 
Arne Vajhøj

On 2/20/11 10:02 AM, Arne Vajhøj wrote:
[...]
Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).

It does not.

You can change a readonly field.

Sorry, I was not clear. I agree that there are scenarios in which even
a readonly field can be changed. But it's important to note that these
are a bit more "hacky" than just using reflection.

My point is that you can't simply get a FieldInfo and go writing over
the field. It's not true that "Refection [sic] could be used to change
read-only variables."

For example, in the code below, only "_field2" is modified:
Are you certain of this? The code I copied and pasted changes both
values.

Not here.

It silently ignores the attempt to modify _field1.

And I would argue that silently ignoring it is not much better
than changing the value.

Arne
 
Registered User

[...]
For example, in the code below, only "_field2" is modified:
Are you certain of this? The code I copied and pasted changes both
values.

I'm certain. What version of .NET are you using?

I'm using the code below and VS2008.

regards
A.G.

using System;
using System.Reflection;

namespace ThrowAwayConsole
{
    class Program
    {
        static readonly int _field1 = 17;
        static int _field2 = 19;

        static void Main(string[] args)
        {
            Console.WriteLine("_field1: {0}", _field1);
            Console.WriteLine("_field2: {0}", _field2);

            SetFieldValue("_field1", 31);
            SetFieldValue("_field2", 37);

            Console.WriteLine("_field1: {0}", _field1);
            Console.WriteLine("_field2: {0}", _field2);

            Console.WriteLine(" - done -");
            Console.ReadLine();
        }

        static void SetFieldValue(string name, object value)
        {
            FieldInfo fi = typeof(Program).GetField(name,
                BindingFlags.Static | BindingFlags.NonPublic);
            fi.SetValue(null, value);
        }
    }
}
 
Arne Vajhøj

[...]
For example, in the code below, only "_field2" is modified:

Are you certain of this? The code I copied and pasted changes both
values.

I'm certain. What version of .NET are you using?

I'm using the code below and VS2008.

regards
A.G.

using System;
using System.Reflection;

namespace ThrowAwayConsole
{
    class Program
    {
        static readonly int _field1 = 17;
        static int _field2 = 19;

        static void Main(string[] args)
        {
            Console.WriteLine("_field1: {0}", _field1);
            Console.WriteLine("_field2: {0}", _field2);

            SetFieldValue("_field1", 31);
            SetFieldValue("_field2", 37);

            Console.WriteLine("_field1: {0}", _field1);
            Console.WriteLine("_field2: {0}", _field2);

            Console.WriteLine(" - done -");
            Console.ReadLine();
        }

        static void SetFieldValue(string name, object value)
        {
            FieldInfo fi = typeof(Program).GetField(name,
                BindingFlags.Static | BindingFlags.NonPublic);
            fi.SetValue(null, value);
        }
    }
}

And it writes:

_field1: 17
_field2: 19
_field1: 31
_field2: 37
- done -

not:

_field1: 17
_field2: 19
_field1: 17
_field2: 37
- done -

?

Arne
 
Arne Vajhøj

Interesting. It never occurred to me to try the test with instance
fields instead of static (since it's not like it's code I'd ever be
likely to write for real, I've only ever looked at the most expedient
way to write the test).

I don't have time to look at the CLR spec right now. The MSDN docs do
imply that reflection should in fact work, which would make the failure
on static fields a bug. The documentation for FieldInfo.SetValue
suggests that as long as the code has the
SecurityPermissionFlag.SerializationFormatter permission that readonly
fields (aka "init-only" in CLR parlance) are in fact legal to modify
through reflection.

(The flag name hinting, of course, at one legitimate scenario for
allowing this — to initialize readonly fields after an object's been
constructed for the purpose of deserialization — and also why it might
not work for static fields, since deserialization doesn't apply to
static members).

I suppose at some point someone ought to look through the CLR spec and
see what it says about it.

I doubt that the API docs are correct in implying that it should work.

The CLI spec says:

<quote>
[Note: The use of ldflda or ldsflda on an initonly field makes code
unverifiable. In unverifiable code, the
VES need not check whether initonly fields are mutated outside the
constructors. The VES need not report
any errors if a method changes the value of a constant. However, such
code is not valid. end note]
</quote>

Which I believe translates to "It is really not compliant code,
but the runtime will not detect it".

Non-compliant must mean that the behavior is really undefined.

When the question came up at SO here:

http://stackoverflow.com/questions/...-private-readonly-field-in-c-using-reflection

Eric Lippert wrote:

<quote>
I note also that just because you can in some implementation today does
not mean that you can on every implementation for all time. I am not
aware of any place where we document that readonly fields must be
mutable via reflection. As far as I know, a conforming implementation of
the CLI is perfectly free to implement readonly fields such that they
throw exceptions when mutated via reflection after the constructor is done.
</quote>

which indicates the same.

And he is an insider at MS.

Arne
 
Arne Vajhøj

Interesting. It never occurred to me to try the test with instance
fields instead of static (since it's not like it's code I'd ever be
likely to write for real, I've only ever looked at the most expedient
way to write the test).

I don't have time to look at the CLR spec right now. The MSDN docs do
imply that reflection should in fact work, which would make the failure
on static fields a bug. The documentation for FieldInfo.SetValue
suggests that as long as the code has the
SecurityPermissionFlag.SerializationFormatter permission that readonly
fields (aka "init-only" in CLR parlance) are in fact legal to modify
through reflection.

(The flag name hinting, of course, at one legitimate scenario for
allowing this — to initialize readonly fields after an object's been
constructed for the purpose of deserialization — and also why it might
not work for static fields, since deserialization doesn't apply to
static members).

I suppose at some point someone ought to look through the CLR spec and
see what it says about it.

I doubt that the API docs are correct in implying that it should work.

The CLI spec says:

<quote>
[Note: The use of ldflda or ldsflda on an initonly field makes code
unverifiable. In unverifiable code, the
VES need not check whether initonly fields are mutated outside the
constructors. The VES need not report
any errors if a method changes the value of a constant. However, such
code is not valid. end note]
</quote>

Which I believe translates to "It is really not compliant code,
but the runtime will not detect it".

Non-compliant must mean that the behavior is really undefined.

When the question came up at SO here:

http://stackoverflow.com/questions/...-private-readonly-field-in-c-using-reflection


Eric Lippert wrote:

<quote>
I note also that just because you can in some implementation today does
not mean that you can on every implementation for all time. I am not
aware of any place where we document that readonly fields must be
mutable via reflection. As far as I know, a conforming implementation of
the CLI is perfectly free to implement readonly fields such that they
throw exceptions when mutated via reflection after the constructor is done.
</quote>

which indicates the same.

And he is an insider at MS.

But the bottom line is that with current implementations
of .NET, it is possible to change instance readonly fields.

So the book was perfectly right to warn against it.

Arne
 
Arne Vajhøj

On 2/20/11 6:44 PM, Registered User wrote:
[...]
For example, in the code below, only "_field2" is modified:

Are you certain of this? The code I copied and pasted changes both
values.

I'm certain. What version of .NET are you using?

I'm using the code below and VS2008.

I'm using VS2010.

I took a look at the CLR spec, and I admit…I'm not really clear on what
the rules are. There is (among other things) this text:

The init-only constraint promises (hence, requires) that once
the location has been initialized, its contents never change.
Namely, the contents are initialized before any access, and
after initialization, no value can be stored in the location.

Doesn't seem that equivocal to me. Fields marked as init-only (i.e. C#
"readonly" fields) aren't supposed to be modifiable after initialization.
Yep.

There is also this:

Must be able to read field initialization metadata for
static literal fields and inline the value specified when
referenced. Consumers can assume that the type of the field
initialization metadata is exactly the same as the type of
the literal field

That basically is saying that for static readonly fields, the compiler
is free to not bother reading the value, but rather to just inline it as
if it were a "const" member instead.

It does not, because a readonly field is not a literal field.

Literal fields are const, not readonly.
That is, the call to SetValue() _does_ succeed (and if I had time to
test, I would attempt to confirm that by calling GetValue() right after
to examine the field's value that way), but because the compiler has
optimized (even in the debug build) the readonly field, the code that
looks like it's accessing it actually isn't (so even though the field is
updated, the new value isn't seen by the code executing later).
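
A minimal sketch of the check described above: write through `FieldInfo.SetValue`, then compare what `GetValue` reports with what ordinary code reads. `StaticReadonlyProbe` is a hypothetical name. The outcome depends on the runtime, and some runtimes reject the write with a `FieldAccessException`, so the sketch reports what happened instead of assuming a result.

```csharp
using System;
using System.Reflection;

static class StaticReadonlyProbe
{
    static readonly int _field1 = 17;

    // Returns a short description of what this runtime did.
    public static string Probe()
    {
        FieldInfo fi = typeof(StaticReadonlyProbe).GetField(
            "_field1", BindingFlags.Static | BindingFlags.NonPublic);
        try
        {
            fi.SetValue(null, 31);
        }
        catch (FieldAccessException)
        {
            // The runtime refused to write to the init-only static field.
            return "write rejected";
        }

        int viaReflection = (int)fi.GetValue(null);   // what reflection sees
        int viaDirectRead = _field1;                  // what compiled code sees
        return string.Format("reflection={0}, direct={1}",
            viaReflection, viaDirectRead);
    }

    static void Main()
    {
        Console.WriteLine(Probe());
    }
}
```

If the two numbers differ, the write landed but compiled code is not observing it; if they match, either the write was ignored (both 17) or it took full effect (both 31).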

I tried with both 1.1 and 2.0/3.5 and it behaves like 4.0.
There is some verbiage regarding verifiable vs unverifiable code, which
seems to be saying that a run-time is in fact allowed to create a
situation where the readonly field can be modified, but that doing so
may result in unverifiable code (when the "ldflda" or "ldsflda"
instructions are used, which retrieve the address of the field in
question). Also, the documentation for the "stfld" instruction
specifically calls out the possibility of using that instruction to
modify a readonly field, even going so far as to allow that in
verifiable code.

I think that is the important part.
But I don't see how that squares with the other text that says the field
isn't allowed to change.

The difference must be between what is correct, with defined
behavior, and what is incorrect, with undefined behavior.

Arne
 
Arne Vajhøj

[...]
Actually, if I recall correctly, the CLR protects read-only variables
(i.e. fields).
It does not.
You can change a readonly field.

Sorry, I was not clear. I agree that there are scenarios in which even
a readonly field can be changed. But it's important to note that these
are a bit more "hacky" than just using reflection.

My point is that you can't simply get a FieldInfo and go writing over
the field. It's not true that "Refection [sic] could be used to change
read-only variables."

LOL. Refection is a real word meaning nourishment. That's why the
spell checker didn't catch it. While I'm here, I agree that there's
likely to be a much greater effort required to change a read-only
property if it doesn't have a setter versus modifying a read-only
attribute. It's useful to know that such gymnastics are possible, but
it's nothing I need to worry about for now. This thread happened to
cover attribute usage anyway.

As shown, modifying an instance readonly field is as simple as
getting the field and calling SetValue.

Arne
 
Arne Vajhøj

I wouldn't use the word "tricked". To me, that implies some active
attempt to deceive. And I would agree that beginners may easily make the
mistake. But that's no reason to compound their misunderstanding by
implying that there's even any good reason to think a read-only property
implies more than just "can't set the value of the property".

The sooner beginners get that thought out of their head, the sooner they
get their mind-set to more properly match the actual semantics and
mechanisms of the language.

Which makes it a good point to mention in a book whose
purpose is to turn beginners into non-beginners!?
In fact, it's not uncommon at all for a read-only property to
_intentionally_ return a mutable object for the express purpose of
allowing the caller to manipulate a certain kind of state through that
object.
[...]

I agree that it is often intended.

But it also happens frequently that it is not intended.

Huh? You're saying that _frequently_ someone will make a property
read-only with the intent that the reference type object that is
returned is _also_ immutable?

No.

I am saying that they return a mutable object without any
intention of it being modified by the caller.
I hardly even see that here in code written by beginners, and have never
seen it in real-world code. In what context have you observed it to be
_frequent_?

It happens all the time.

Because it is usually too cumbersome to return something
immutable when you have something that is mutable.

Arne
 
Registered User

On 2/20/11 6:44 PM, Registered User wrote:
[...]
For example, in the code below, only "_field2" is modified:

Are you certain of this? The code I copied and pasted changes both
values.

I'm certain. What version of .NET are you using?

I'm using the code below and VS2008.

I'm using VS2010.
When I run the same code in VS2010 the field is not modified. I
appreciate the details you have provided below.

regards
A.G.
 
