Automatic vs. explicit initialization of immutable static objects.

Marcel Müller

I have seen two implementations to instantiate static objects:


#1 - the easy way

public static class Array<T>
{
    public static readonly T[] Empty = new T[0];
}


#2 - using a get property

public static class Array<T>
{
    private static T[] _Empty;

    public static T[] Empty
    {
        get
        {
            // No synchronization: two threads may each create an array,
            // and the last write to _Empty wins.
            if (_Empty == null)
                _Empty = new T[0];
            return _Empty;
        }
    }
}


The first implementation is the obvious one that I usually prefer.
But I have seen the second form too. E.g. the implementation of
Enumerable.Empty<T>() uses an internal helper class that works like #2.

Is there some serious reason to prefer #2?

The only things I can think of are that a static class initializer
has side effects in case of exceptions, and that the existence of a
static class initializer involves locking, while the second
implementation is lock-free (with a race condition that does no harm
under the .NET memory model).
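
To illustrate the exception side effect, here is a minimal sketch. The names Broken and InitializerFailureDemo are made up for this example, and the empty static constructor is only there to make the initialization time deterministic. If a static initializer throws, the CLR wraps the error in a TypeInitializationException, and every later access to the type fails the same way; the initializer is not run again.

```csharp
using System;

public static class Broken
{
    // Hypothetical initializer that always fails.
    public static readonly int[] Data = Create();

    // An explicit static constructor makes the type initialize exactly
    // at the first access instead of at some earlier, unspecified time.
    static Broken() { }

    private static int[] Create()
    {
        throw new InvalidOperationException("initialization failed");
    }
}

public static class InitializerFailureDemo
{
    // Records the inner exception type seen on each of two accesses.
    public static string[] Run()
    {
        var seen = new string[2];
        for (int i = 0; i < 2; i++)
        {
            try
            {
                GC.KeepAlive(Broken.Data);
                seen[i] = "none";
            }
            catch (TypeInitializationException ex)
            {
                // The second access fails as well: the type stays unusable.
                seen[i] = ex.InnerException.GetType().Name;
            }
        }
        return seen;
    }
}
```

With #2, by contrast, a failing "new" would simply throw from the getter, and the next call would try again.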


Marcel
 
bradbury9

On Thursday, 1 March 2012 14:58:43 UTC+1, Marcel Müller wrote:
> Is there some serious reason to prefer #2?

AFAIK the reason for #2 is to be able to bind the member to a DataSource, which needs a property rather than a field.
 
Marcel Müller

>> Is there some serious reason to prefer #2?
>
> The second version is not thread-safe. So the first question that should
> be asked is whether that's important or not.

#2 is perfectly thread-safe as long as singleton semantics are not required.

> One advantage of the second version is that it's "lazy". You don't get an
> instance until you actually need one. The first version will create an
> instance when the type is initialized.

Hm, this is a real difference if the class provides different kinds of
services.

> Note that since .NET 4, there is the Lazy<T> type which provides for lazy
> initialization without having to implement the property directly. For
> example:
>
>     public static readonly Lazy<T[]> Empty = new Lazy<T[]>(() => new T[0]);
>
> Lazy<T> is thread-safe by default. It does change the code syntax a bit,
> since you use the Value property to actually retrieve the value. But where
> you actually need lazy initialization I think it's preferable.

Currently I am stuck on .NET 3.5. But it would be trivial to implement

> Beyond that, I think that it's not really possible to say one is always
> better than the other. To some extent it really depends on what you're
> actually initializing and how it's used later on in the program. Without
> knowing those details, it's not possible to answer the question generally.

You already got me to the right point. The main difference is the time
of the initialization and, of course, the singleton aspect.
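
For .NET 3.5, a hand-written stand-in for Lazy<T> could look roughly like the sketch below. SimpleLazy<T> is a made-up name, not a framework type, and the sketch only covers the basic case (one factory, double-checked locking, no special exception handling).

```csharp
using System;

// Hypothetical stand-in for Lazy<T> on .NET 3.5; not the framework type.
public sealed class SimpleLazy<T>
{
    private readonly object _sync = new object();
    private Func<T> _factory;
    private T _value;
    private volatile bool _created;

    public SimpleLazy(Func<T> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        _factory = factory;
    }

    public T Value
    {
        get
        {
            // Fast path: no lock once the value exists.
            if (!_created)
            {
                lock (_sync)
                {
                    // Re-check under the lock so the factory runs only once.
                    if (!_created)
                    {
                        _value = _factory();
                        _factory = null;   // release the delegate
                        _created = true;   // volatile write publishes _value
                    }
                }
            }
            return _value;
        }
    }
}
```

It would be used like the Lazy<T> field above, e.g. new SimpleLazy<T[]>(() => new T[0]), with callers reading the Value property.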


Marcel
 
Marcel Müller

> For certain: unless the design is _specifically_ to _intend_ for multiple
> instances of the object to be created, I would hardly say it's _perfectly_
> thread-safe, even if the program logic is unharmed by the implementation
> choice. "Imperfectly thread-safe", maybe.

Well, I don't think a design pattern as such can be thread-safe. That
alone is not sufficient. Only the whole implementation of some logic can
be thread-safe.

> Beyond that, you are clearly using a different definition of "thread-safe"
> than I would. As I already pointed out, even in the example at hand there
> is the potential for problems: if the code that uses this instance depends
> on multiple calls to the property returning identical references, the code
> as written is not safe.

Yes. But on the other hand, a thread-safe library is often not
sufficient for defined behavior of the application. So I have to care
about thread safety at almost all application levels anyway.

But you are right. My definition of thread-safe is that the application
shows defined behavior. This does not necessarily imply that there are
no code paths with undefined behavior inside the application.

In practice I use the pattern with the potential double initialization
once in a while for simple cache implementations. The drawback of
calculating a cache content twice from time to time is often
considerably smaller than always fiddling with locks. Namely, the example
with the cache has no reasonable solution with a simple mutex. It needs
something like a condvar. Otherwise the cache needs to be locked as long
as a cache item is calculated. This prohibits any parallelism.
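
A lock-free variant of #2 that still hands out identical references might look like the sketch below. EmptyArray<T> is a made-up name. Several threads may still compute a candidate, but Interlocked.CompareExchange lets only the first one publish it, so the duplicate work remains while the duplicate instances do not escape. The plain first read of the field relies on the same memory model assumption as #2.

```csharp
using System.Threading;

public static class EmptyArray<T>
{
    private static T[] _empty;

    public static T[] Value
    {
        get
        {
            T[] current = _empty;
            if (current == null)
            {
                // Several threads may reach this point and each allocate.
                T[] candidate = new T[0];

                // Only the first CompareExchange replaces null. It returns
                // the previous field value: null for the winner, or the
                // winner's instance for everyone else.
                current = Interlocked.CompareExchange(ref _empty, candidate, null)
                          ?? candidate;
            }
            return current;
        }
    }
}
```

The losing candidates are simply garbage, which is the same cost as computing a cache entry twice.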

> I suppose you could say that's a "requirement for a singleton", and maybe
> that's a reasonable way to look at it. But the fact remains. It also
> raises the question: if not for singleton-like semantics, why have such a
> static instance in the first place?

Mostly for caching of smaller items that are used quite often.

> If the object is expensive to create, then surely it is better to have true
> singleton semantics. If it's not expensive to create, and it's otherwise
> not actually a problem for multiple instances to exist, then why have the
> static instance in the first place?

It depends on how often you need to create the temporary. If this is
done in functions called from inner loops, it can be reasonable not to
write new every time. Furthermore, if the retrieved items become part of
a long-lived object, very many objects may propagate to generation 2,
which is normally not advisable.

> Unfortunately, the example you posted is degenerate. It offers no real
> insight regarding real-world usage patterns. So it doesn't really help the
> discussion much.

Well, that's a common problem of examples that fit into a discussion.

> But I suggest that in the real world, static read-only
> instances nearly always come with an implied need for thread-safety. A
> lack of such a need nearly always will be proof that the static read-only
> instance itself is not needed.

You are right, if you talk about "needed for functionality".

> I don't see how the readability is affected significantly. However, if it
> bothers you that much you can always wrap the Lazy<T> object with a
> property. Then Lazy<T> is just a hidden implementation detail, helping you
> avoid boilerplate code.

That's right. But then you no longer save much compared to the
explicit implementation, which is not that bad either.
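
For comparison, the wrapped form would be roughly the following (.NET 4 or later; EmptyHolder<T> is a made-up name). The caller sees the same Empty property as in #2, and Lazy<T> stays a private detail.

```csharp
using System;

public static class EmptyHolder<T>
{
    // Lazy<T> is thread-safe by default: the factory runs at most once.
    private static readonly Lazy<T[]> _empty =
        new Lazy<T[]>(() => new T[0]);

    public static T[] Empty
    {
        get { return _empty.Value; }
    }
}
```

That is about as many lines as #2, but it returns the same reference to every caller.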


Marcel
 
