GC:
Problem #2: Unsafe contexts. Unsafe code makes it possible to modify managed
types. This significantly undermines the effectiveness of a GC and means that
all unsafe code can potentially contain classes of errors which a GC mechanism
normally removes.
From the language specification:
There are certainly issues that can come up from misuse of unsafe code. The
example you cited from the MSDN talks about strings, for example. Strings
are immutable types in the CLI/CLR, which allows the CLR to do
optimizations. For example, assigning one string ref to another doesn't copy
the contents. Since the runtime knows strings shouldn't change, both
references point at the same character array until one is assigned a
different value. If you use unsafe code with one ref and change the value,
you are inadvertently changing another string ref's data as well, which
could cause a series of bugs. However, this in no way affects the GC. The
objects themselves live on the managed heap, and the GC will eventually get
around to cleaning them. I *strongly urge* you to read up on the GC and how
it works internally.

Having said that, let's look at a problem with our string scenario that
can't be solved without unsafe or unmanaged code. Let's say you create one
string variable and decrypt cipher data into it containing a password (or
any secret bit of info - a credit card number, etc.). You do what you need
to do in your function, and when the function ends, your string variable
goes out of scope. The string is now eligible for collection. However, you
have no idea how long it will remain in memory. It's possible that it's
there for some time, and let's say it gets paged out. Now a clever hacker
can sniff through your page file for secret data, and it will show up.
Programmers who are mindful of security concerns need to wipe secrets out of
memory as quickly as possible. Knowing that there is only ever one reference
to the string, there is no reason I can't pin that string and, using unsafe
or unmanaged code, zero it out.

You may not like unsafe code, and it does have the potential of burning
developers who don't know what they're doing, but it does have its
legitimate uses. Remember that you can turn off this feature by not allowing
the compiler switch to be used by your development team. C# will not compile
unsafe code by default. You do not mention that in your text.
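
Here's a minimal sketch of that wipe (my own illustrative code, not from any
library; it must be compiled with the /unsafe switch). It also happens to
demonstrate the aliasing hazard from the MSDN example, since the second
reference sees the wiped characters too:

    using System;

    class SecretWiper
    {
        static unsafe void WipeString(string secret)
        {
            // Pin the string so the GC can't move it while we write,
            // then overwrite every character in place. This mutates the
            // "immutable" string - exactly what only unsafe code can do.
            fixed (char* p = secret)
            {
                for (int i = 0; i < secret.Length; i++)
                    p[i] = '\0';
            }
        }

        static void Main()
        {
            string password = new string('x', 8); // stand-in for a decrypted secret
            string alias = password;              // copies the reference, not the chars
            WipeString(password);
            // alias points at the same character array, so it was wiped too -
            // the aliasing behavior described above.
            Console.WriteLine(alias.Length);      // still 8, but every char is '\0'
        }
    }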
Value Types / Reference Types
A type in C# declares its own passing and assignment semantics. This violates a
consistency rule of expected behaviour. Since objects are always reference types,
this also leads to several small performance penalties:
a. An extra word is always needed for the pointer.
b. There are more heap allocations than are strictly necessary, leading to
   memory fragmentation.
c. The garbage collector has more memory to clean up than is strictly
   necessary.
d. Every access pays a pointer-dereference penalty.
As a result of having two kinds of types, and in order to have some kind of
unification, C# chose boxing/unboxing semantics. Boxing and unboxing are both
subtle and confusing, as they can lead to a surprising interpretation of
straightforward code.
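
A minimal sketch of the kind of surprise in question (the struct here is
hypothetical): boxing copies the value, so the box and the original silently
diverge.

    using System;

    struct Counter { public int Value; }

    class BoxingDemo
    {
        static void Main()
        {
            Counter c = new Counter();
            object boxed = c;         // boxing: a *copy* of c goes on the heap
            c.Value = 42;             // changes the original, not the boxed copy
            Console.WriteLine(((Counter)boxed).Value);  // prints 0, not 42
        }
    }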
Are you saying that there should only be either all value types or all
reference types? If so, which one would you use? Languages like Java use
exclusively reference types (except for special-case primitives) because it
is the most flexible kind of type. However, if you kept all reference types,
then you'd have nothing but types that fit your "performance penalties". On
the other hand, value types have an enormous number of limitations compared
to reference types, so I can't even fathom a language with all value types.
What's your alternative?
Also, the managed heap doesn't suffer from memory fragmentation the way
heaps in certain other languages do. Use of the GC allows all memory to be
allocated sequentially, and as objects get collected, the space is compacted
by generation. However, let's pretend for a minute this isn't the case. How
does using a non-reference type change the way the heap is used? You are
still allocating and cleaning out memory, and you'd still be leaving
fragments of unallocated space. I don't quite understand this bullet point,
or the next one. Finally, if you are using a system that has no reference
types, how can you have references that point to the same data? Would you
have to copy all your data EVERY time? How would you synchronize instances
to maintain the semantics of assignment? Wouldn't that suck for performance?
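
For reference, the two assignment semantics we're arguing about, in a
minimal sketch (type names are mine):

    using System;

    struct PointV { public int X; }   // value type: assignment copies the data
    class PointR { public int X; }    // reference type: assignment copies the reference

    class Semantics
    {
        static void Main()
        {
            PointV a = new PointV();
            PointV b = a;             // b is an independent copy
            b.X = 1;
            Console.WriteLine(a.X);   // 0 - a is untouched

            PointR c = new PointR();
            PointR d = c;             // c and d name the same object
            d.X = 1;
            Console.WriteLine(c.X);   // 1 - the change is shared
        }
    }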
Special Primitives and Immutability
"a const field is a compile-time constant"
You still haven't answered my question about how you would allow reference
types (or other user-defined types) to have this behavior. This is where we
need to develop the syntax that allows programmers to specify literal values
for what could be relatively complex object graphs; THEN, on top of that, the
compiler needs to have a mechanism to store this data in the program AND
reconstitute it at runtime. Then we run into the problem of read-only (which
C# can do) vs. immutability. If you reconstitute the value at runtime by
normal instantiation, you are only creating a read-only value (which C# can
already do). If you reconstitute the value in a truly const manner, you
effectively bypass constructors, which is disastrous. This is why all the
languages I can think of don't allow this. If you are suggesting that C# is
doing something that is less than ideal, what is your ideal solution?
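
To put the read-only vs. immutability distinction in code (a minimal sketch
of my own):

    using System;

    class ConstDemo
    {
        public const int Answer = 42;  // compile-time constant, baked into the IL

        public readonly int[] Table;   // initialized at runtime; the reference is frozen

        public ConstDemo()
        {
            Table = new int[] { 1, 2, 3 }; // ordinary construction - constructors still run
        }

        static void Main()
        {
            ConstDemo d = new ConstDemo();
            // d.Table = new int[0];       // compile error: the field is readonly
            d.Table[0] = 99;               // legal: the array itself is NOT immutable
            Console.WriteLine(d.Table[0]); // 99
        }
    }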
C# does not support delegating interface implementations to member classes.
This significantly restricts the ability to use interfaces extensively.
Example? I'm not entirely disagreeing with you, but I'm wondering if we are
exaggerating the problem by using "extensively"?
The reason I'm not disagreeing entirely is that I'm not exactly fond of the
syntax C# uses for implementing interfaces. However, I disagree that C# can't
do this. It is perfectly capable of delegating interface method calls to
member classes... I think you mean delegating the method call to a method of
a *contained* class (rather than a nested class, right?). Anyway, again, C#
is perfectly capable of doing this.
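
A quick sketch of the delegation I mean (type names are mine):

    using System;

    interface IGreeter
    {
        void Greet();
    }

    // The contained worker that actually implements the behaviour.
    class GreeterImpl : IGreeter
    {
        public void Greet() { Console.WriteLine("Hello"); }
    }

    // The outer class satisfies the interface by forwarding each
    // call, by hand, to the contained instance.
    class Widget : IGreeter
    {
        private IGreeter inner = new GreeterImpl();

        public void Greet() { inner.Greet(); }

        static void Main()
        {
            IGreeter g = new Widget();
            g.Greet();                // dispatched to the contained GreeterImpl
        }
    }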
Missing Interface Extensions
Example, please? Are you assuming that you can't define interfaces in C#?
That's what this reads like, and it certainly isn't true at all. C# allows
you to write classes, abstract classes, AND stand-alone interfaces.
Interfaces in C# have no implementation and are not tied to a class
definition, so I'm confused by what you mean in your statement. Class
inheritance is NOT the only means of polymorphism in C#; that is also an
incorrect statement. You can certainly create and implement interfaces.
Please clarify this point, because I don't think you understand what C# is
doing here.
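
A minimal sketch of a stand-alone interface and interface-based
polymorphism, with no class inheritance involved (type names are mine):

    using System;

    // A stand-alone interface: no implementation, not tied to any class.
    interface IShape
    {
        double Area();
    }

    class Circle : IShape
    {
        private double r;
        public Circle(double r) { this.r = r; }
        public double Area() { return Math.PI * r * r; }
    }

    class Square : IShape
    {
        private double s;
        public Square(double s) { this.s = s; }
        public double Area() { return s * s; }
    }

    class Demo
    {
        static void Main()
        {
            // Polymorphism through the interface alone.
            IShape[] shapes = { new Circle(1.0), new Square(2.0) };
            foreach (IShape shape in shapes)
                Console.WriteLine(shape.Area());
        }
    }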
Source File Layout
[The lack of separate and explicit definitions, a.k.a. header files]
promotes an unstructured coding style and often leads to bugs that are not
immediately obvious without reading the automatically generated
documentation.
I'm not sure I agree with this. What type of bugs are we talking about here,
and how do we know this happens often? If you make an assertion, you are
going to have to back it up or qualify it. You may *prefer* to have header
files for various reasons, but you shouldn't make a blanket statement about
bugs without having data to prove it, or at the very least giving a good
example (or more) of the types of bugs we're dealing with here. I also think
you are going to have to explain why header files are better, in your
opinion, than documentation... in a sense, both are being viewed as
documentation here, and both are nothing but text at this point.
-Rob Teixeira [MVP]