Reference to a reference?

kndg

Hi all,

I know that C# supports a pointer-to-pointer type, but that only works
with value types. As far as I know, C# does not support a reference to a
reference type (I don't know the correct term, but what I mean is that
you cannot store the address of a reference-type variable; am I
correct?). Hopefully the code example below explains what I am trying to
do. I have been searching the internet for a workaround but couldn't
find a good answer, so I made my own wrapper class...

    using System;

    // Read-only wrapper: re-evaluates the delegate on every access,
    // so it always sees the current value of the captured variable.
    public class Reference<T>
    {
        private Func<T> pointer;

        public T Value
        {
            get { return pointer(); }
        }

        public Reference(Func<T> pointer)
        {
            this.pointer = pointer;
        }
    }

    public class Test
    {
        public static void Main()
        {
            string str1 = "Hello";
            string str2 = "World";

            var ref1 = new Reference<string>(() => str1);
            var ref2 = new Reference<string>(() => str2);

            str1 = "Hi";

            Console.WriteLine(ref1.Value); // prints "Hi"
            Console.WriteLine(ref2.Value); // prints "World"
        }
    }

The above code works, but I'm hoping for something much simpler. The
above class takes a lambda as a constructor parameter, which has a
drawback (the file size will grow proportionally with the number of
variables used), though it is not something that I'm worried about. I
tried changing the constructor signature to pass by reference but
couldn't make it work.

    public Reference(ref T pointer)
    {
        // do something
    }
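
A `ref` parameter is only valid for the duration of the call: it cannot
be stored in a field of an ordinary class or captured by a lambda, which
is probably why the constructor above could not be made to work. One
possible workaround, sketched here with a hypothetical `Variable<T>`
class (the name and shape are assumptions, not a framework type), is to
pass a setter delegate alongside the getter so the wrapper becomes
read/write:

```csharp
using System;

// Hypothetical read/write wrapper around a captured variable.
public class Variable<T>
{
    private readonly Func<T> getter;
    private readonly Action<T> setter;

    public Variable(Func<T> getter, Action<T> setter)
    {
        this.getter = getter;
        this.setter = setter;
    }

    public T Value
    {
        get { return getter(); }
        set { setter(value); }
    }
}

public static class VariableDemo
{
    public static string Run()
    {
        string str = "Hello";
        var v = new Variable<string>(() => str, x => str = x);

        v.Value = "Hi"; // writes through to the captured local

        return str;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "Hi"
    }
}
```

The cost is the same as for the read-only version: every wrapped
variable needs its own pair of lambdas.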

I know the above code smells of bad design, but I couldn't find an
alternative for my situation. I am working on a data structure
(specifically, a HashSet) that stores some objects. Some of these
objects are not initialized during start-up and have the null value.
These objects will later be initialized based on user input. If I add
these objects to the HashSet, only the first one will be added, while
the rest are discarded, which is not what I want. I tried making a
wrapper class for this object, but it makes the code look more
complicated than it is supposed to be.

Does anyone here have any advice?

Regards.
 
kndg

Are you talking about in unsafe code? Is unsafe code an acceptable
condition for your needs?

Currently, this is my free-time project, so anything is acceptable.
[...]
The above code works, but I'm hoping for something much simpler. The
above class takes a lambda as a constructor parameter, which has a
drawback (the file size will grow proportionally with the number of
variables used)

What file size are you talking about?

Err, I mean the executable file size. The compiler will generate a
separate method for each lambda, so this will translate to an increase
in file size (though not enough to worry about).
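
As a rough illustration of what the compiler generates: a local that is
captured by a lambda is hoisted into a compiler-generated class, and
each lambda becomes a method on that class. All lambdas capturing the
same local therefore share one copy of it, which is why
`Reference<T>.Value` sees later assignments. The `ClosureDemo` name
below is made up for this sketch:

```csharp
using System;

public static class ClosureDemo
{
    public static int Run()
    {
        int counter = 0;

        // Both lambdas capture the same variable, not a copy of its value.
        Func<int> read = () => counter;
        Action increment = () => counter++;

        increment();    // counter is now 1
        counter += 10;  // direct writes are visible to the lambdas too

        return read();
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "11"
    }
}
```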
[...]
What's the point of storing a null reference in a HashSet? Even if you
could store references to references in a HashSet, what would the point
of that be? Without more work, the key will be the reference itself, and
of course if you have the reference, you already have the object. You
don't need a HashSet to get the object.

It would be helpful if you would be more specific about the actual
problem you're trying to solve. Your description is hard to follow. It
would be good to see actual code, even if it doesn't compile, to show
what you'd _like_ to do.

I am creating an immutable Set class which uses HashSet as its internal
data structure. The current .NET implementation of HashSet is
destructive, so I need an immutable one. Sample code is below (it has
been trimmed considerably to save space):

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace XYZ.Math
    {
        public class Set<T> : IEnumerable<T>
        {
            private readonly HashSet<T> items;
            private static readonly Set<T> emptySet = new Set<T>();

            public int Count
            {
                get { return items.Count; }
            }

            public static Set<T> EmptySet
            {
                get { return emptySet; }
            }

            private Set()
            {
                items = new HashSet<T>();
            }

            public Set(IEnumerable<T> collection)
                : this()
            {
                if (collection == null)
                {
                    throw new ArgumentNullException("collection");
                }

                foreach (var item in collection)
                {
                    items.Add(item);
                }
            }

            // Private: only used while a new set is being built.
            private bool Add(T item)
            {
                return items.Add(item);
            }

            public bool Contains(params T[] elements)
            {
                return (elements != null) && elements.All(items.Contains);
            }

            public Set<T> UnionWith(Set<T> other) // (a + b) or (a | b)
            {
                if (other == null) return this;

                var union = new Set<T>(items);

                foreach (var item in other)
                {
                    union.Add(item);
                }

                return union;
            }

            // union
            public static Set<T> operator +(Set<T> a, Set<T> b)
            {
                // Guard against both operands being null.
                if (a == null) return (b == null) ? EmptySet : b;

                return a.UnionWith(b);
            }

            // adding a single item
            public static Set<T> operator +(Set<T> a, T item)
            {
                return new Set<T>(a) { item };
            }

            // equality
            public static bool operator ==(Set<T> a, Set<T> b)
            {
                if (ReferenceEquals(a, b)) return true; // same instance or both null
                if (ReferenceEquals(a, null)) return false;
                if (ReferenceEquals(b, null)) return false;

                return (a.Count == b.Count) && a.All(item => b.Contains(item));
            }

            public static bool operator !=(Set<T> a, Set<T> b)
            {
                return !(a == b);
            }

            public bool Equals(Set<T> other)
            {
                return (this == other);
            }

            public override bool Equals(object obj)
            {
                if (ReferenceEquals(this, obj)) return true;
                if (ReferenceEquals(obj, null)) return false; // avoid NullReferenceException

                return (obj.GetType() == typeof(Set<T>)) && Equals((Set<T>)obj);
            }

            public override int GetHashCode()
            {
                // Combine element hashes with XOR so that equal sets (same
                // elements, any order) produce the same hash code.
                int hash = 0;

                foreach (var item in items)
                {
                    if (item != null) hash ^= item.GetHashCode();
                }

                return hash;
            }

            public override string ToString()
            {
                var sb = new StringBuilder("{ ");

                foreach (var item in this)
                {
                    sb.Append(item + " ");
                }

                sb.Append("}");

                return sb.ToString();
            }

            public IEnumerator<T> GetEnumerator()
            {
                return items.GetEnumerator();
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }
    }

Sample usage:

    using System;
    using XYZ.Math;

    public class CustomObject
    {
        public string Text { get; private set; }

        public CustomObject(string text)
        {
            Text = text;
        }
    }

    public class Program
    {
        public static void Main()
        {
            CustomObject obj1 = new CustomObject("a");
            CustomObject obj2 = new CustomObject("b");
            CustomObject obj3 = null;

            var list1 = new CustomObject[] { obj1, obj2, obj3 };

            var list2 = new Reference<CustomObject>[]
            {
                new Reference<CustomObject>(() => obj1),
                new Reference<CustomObject>(() => obj2),
                new Reference<CustomObject>(() => obj3),
            };

            var set1 = new Set<CustomObject>(list1);
            var set2 = new Set<Reference<CustomObject>>(list2);

            obj3 = new CustomObject("hello");

            // below will throw NullReferenceException
            //foreach (var item in set1)
            //{
            //    Console.WriteLine(item.Text);
            //}

            // while this one does not because obj3 has been initialized
            foreach (var item in set2)
            {
                Console.WriteLine(item.Value.Text);
            }

            CustomObject obj4 = null;

            var set3 = set1 + obj4;

            Console.WriteLine(set3.Count); // still 3: cannot add redundant null

            var set4 = set2 + new Reference<CustomObject>(() => obj4);

            Console.WriteLine(set4.Count); // becomes 4: redundant null is okay

            Console.ReadLine();
        }
    }

There are situations where late initialization is necessary (due to
expensive object creation, user interaction, etc.), so I initially set
the object to null, but I know that at some later time it will
eventually get initialized, and I want the set to use the initialized
object instead; hence the creation of the Reference<T> class. Although I
think the Reference<T> class is general-purpose enough to provide
reference-to-reference (?) semantics, its use here is potentially
dangerous (you can have the same object inside the set multiple times,
which defeats the purpose of a HashSet).

Any thoughts?
 
kndg

I'm not sure how the second sentence relates to the first. Just because
it's a "free-time project", it can be written without any regard for any
kind of code reliability, maintainability, readability, etc.?

No, I always strive to code with that discipline in mind, though when it
comes to a "free-time project" I sometimes feel less restricted. From
your statement, I have a feeling that you are saying to stay away from
unsafe code at all costs... that's not the way to code...

I also don't understand your comment about "…that only works on a value
type". You didn't answer my first question: are you making that
statement in the context of unsafe code? Or something else?

Yes, in the context of unsafe code. Or can it be done with reference
types as well?
[...]
It would help if you could clarify what you mean by "immutable" and
"destructive". It's true that, since you have no public members in your
Set<T> class that modify the instance, the class can be considered
immutable. But, what's "destructive" about the .NET HashSet<T> type and
why is that a problem for you?

I should have said 'mutable' instead of 'destructive'.
I would like it to behave more like a set in mathematical set theory
than the original HashSet. So, for example, when we want the union of
two sets, I would get a new set rather than modifying one of the
original sets.
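
To make the difference concrete: `HashSet<T>.UnionWith` modifies the set
it is called on, so a non-destructive union has to work on a copy. A
minimal sketch, where the `SetOps.Union` helper is a made-up name for
illustration:

```csharp
using System;
using System.Collections.Generic;

public static class SetOps
{
    // Returns a new set; neither input is modified.
    public static HashSet<T> Union<T>(HashSet<T> a, HashSet<T> b)
    {
        var result = new HashSet<T>(a); // copy of a
        result.UnionWith(b);            // mutates only the copy
        return result;
    }

    public static void Main()
    {
        var a = new HashSet<int> { 1, 2 };
        var b = new HashSet<int> { 2, 3 };

        var u = Union(a, b);

        Console.WriteLine("{0} {1} {2}", a.Count, b.Count, u.Count); // prints "2 2 3"
    }
}
```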
[...]
There are situations where late initialization is necessary (due to
expensive object creation, user interaction, etc.), so I initially set
the object to null, but I know that at some later time it will
eventually get initialized, and I want the set to use the initialized
object instead; hence the creation of the Reference<T> class. Although I
think the Reference<T> class is general-purpose enough to provide
reference-to-reference (?) semantics, its use here is potentially
dangerous (you can have the same object inside the set multiple times,
which defeats the purpose of a HashSet).

Any thoughts?

I still don't know how you intend to use this Set<T> class. The code you
showed just demonstrates what you've already described. It doesn't
elaborate on what you're really trying to _do_. You've simply restated
the mechanical aspects of the solution you've already decided to
attempt, rather than the higher-level goal at hand.

I actually don't know how I would use this class. While reviewing my old
Sudoku(*) program, I became interested in creating a Set<T> class. Then,
during unit testing, it made me think: what would happen if the set
contained an uninitialized object? In the real world, it is something
like an airplane that has a set of passengers (based on the booked
tickets). Some passengers are already on board while others are still
waiting for clearance. Sooner or later, most will make it on board, but
some may miss the flight. Then, during transit, some will stay while
others may have already reached their destination.

* http://en.wikipedia.org/wiki/Sudoku

If object initialization is expensive, but you are still willing to
incur that cost later (when? is it just the first time you want to use
that object? some other trigger?), why not just build the deferred
initialization into the object class itself?
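
Deferred initialization on first use can also be had without touching
the object class, through `System.Lazy<T>` (available since .NET 4). The
sketch below uses a hypothetical `ExpensiveObject` type and a local
counter purely to show that the factory runs once, on first access:

```csharp
using System;

// Hypothetical type standing in for something costly to construct.
public class ExpensiveObject
{
    public string Text { get; private set; }

    public ExpensiveObject(string text)
    {
        Text = text;
    }
}

public static class LazyDemo
{
    public static string Run()
    {
        int created = 0; // counts how many times the factory runs

        var lazy = new Lazy<ExpensiveObject>(() =>
        {
            created++;
            return new ExpensiveObject("hello");
        });

        bool before = lazy.IsValueCreated; // false: nothing constructed yet
        string text = lazy.Value.Text;     // first access runs the factory
        string again = lazy.Value.Text;    // later accesses reuse the instance

        return before + " " + text + " " + again + " " + created;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "False hello hello 1"
    }
}
```

Because the `Lazy<T>` instance itself never changes, it has a stable
identity regardless of whether the wrapped value has been created yet.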

Yes, that is what I would think. But sometimes the initialization can
be afforded up front, and at other times it needs to be deferred. I
prefer not to modify the class just to support deferred initialization.

Alternatively, if you really want to store a data structure that refers
to some other data structure, why not just maintain a wrapper type? Your
original post hinted that you'd already tried that. I don't see how that
approach would be any more awkward than your current proposal. In fact,
a wrapper would be more convenient, because you could more easily make
it read/write, rather than just read-only (if desired).

I need a wrapper that is generic enough, and my current wrapper does not
work properly. It also feels a little awkward when I have to modify the
instance through a wrapper rather than through the original object
(though that is what a wrapper is supposed to do...).

Finally, I still have no idea how the whole hash set thing works into
your design. What is it about these objects and the variables that
reference them that causes you to want to store the objects (or
references to variables that reference the objects) in a hash set? And
why does the hash set need to be updated automatically when an object is
finally initialized after having been deferred? Why can't you just add
objects to the hash set as they are initialized?


And besides all that, how do you expect to address this question of
having multiple references to the same object in the hash set. The way I
see it, you either hash on the reference to a reference (whatever form
that might take), in which case you can wind up with the same object
represented multiple times in the hash set, or you delegate the hash to
the object, in which case you wind up with objects in the hash set that
can change their hash value. Either scenario is not good _at all_.
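
The second scenario, a stored object whose hash value changes, can be
reproduced in a few lines. This is only a sketch: `MutableKey` is a
made-up type whose hash is derived from a mutable field, standing in for
a wrapper that delegates hashing to a target that is assigned later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key whose hash code depends on mutable state.
public class MutableKey
{
    public int Id;

    public override int GetHashCode()
    {
        return Id;
    }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }
}

public static class HashDemo
{
    public static bool Run()
    {
        var key = new MutableKey();          // Id is 0, so the hash is 0
        var set = new HashSet<MutableKey> { key };

        key.Id = 1;                          // hash changes while stored

        // The set recorded the old hash, so the lookup misses even
        // though the very same instance is still inside it.
        return set.Contains(key);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "False"
    }
}
```

The element is still counted and enumerated, but it can no longer be
found or removed by value until its hash returns to the original one.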

Absent more information, I'm hesitant to criticize your proposed design
too much. But I have to say, the whole thing seems rather suspect. I
feel like if only you would share what the higher-level goal here is,
there would be a better way to address whatever issue you're trying to
solve than to try to reference a reference and store the reference to
the reference in a hash table.

Yeah, I think my thinking is flawed. The more I think about it, the more
it becomes clear why it should be done the other way. I think that's one
of the reasons why C# does not have reference-to-reference semantics. At
the moment, I have no idea how my Set<T> class should be used. I was
just imagining things, and that led to flaws.

Thanks for your helpful thoughts as always.

Regards.
 
kndg

Thank you, Pete.
You have raised some interesting points, and all your comments make
sense. I will study the point you made on immutability again and will
try to find a good way to implement it (time for a little googling).

Regards.

P.S.:
By the way, I think you could be a good writer.
I have been following blogs by great writers such as Eric Lippert and
Jon Skeet, and some other blogs by people at Microsoft, but your blog
seems not to have been updated for a long time.
A book, perhaps? (I would be interested in buying one!)
 
