Making the correct class at runtime based on loaded data

Jon Rea

I have been looking for the last 2 hours for how to do this, without much luck.

I'm going to give a simplified model of the problem I have.

I want a collection class that can hold a series of objects. For argument's
sake, let's make these fruit:
apple
orange
banana


In my application I have:

public abstract class Fruit
{
    float m_Weight = 0.0f;

    public void setweight(float weight)
    {
        m_Weight = weight;
    }
}

public class Apple : Fruit
{
}

public class Orange : Fruit
{
}

public class Bannana : Fruit
{
}


What I want to implement is a collection class that can hold a series of
items.

ArrayList theFruit = new ArrayList();

Now I want to open a file.

pseudocode:

Openthefile(fruity.txt)
{
    foreach line in the file
    {
        find the correct object type
        create that object type
        add the data to it
        add it to the collection class
    }
}

Now the file will contain a list something like this, i.e. a random list of
fruit.

Type, Weight
Bannana 1.03
Apple 2.12
Pear 4.23
Bannana 5.34
Bannana 1.22
Apple 3.33


How do I make the collection class contain objects of the correct type at
runtime?

Many thanks for anyone's help,
Jon Rea
 
Jon Skeet

Jon Rea said:
I have been looking for the last 2 hours for how to do this, without much luck.

I'm going to give a simplified model of the problem I have.

<snip>

Assuming they all have the same properties, the way to do it is:

o Use Type.GetType to get the type listed in the file.
o Check that the type has actually been found (i.e. Type.GetType hasn't
returned null).
o Use Activator.CreateInstance to create an instance of the type.
o Cast the returned reference to the base type (Fruit in your example)
o Call SetWeight etc with the rest of the data in the line.
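
A minimal sketch of the steps above, not production code. It assumes the fruit classes live in a namespace called ReflectionSketch (a made-up name), that each has a public parameterless constructor, and that the file holds bare class names. The helper ReflectionFruitLoader and the Weight property are hypothetical additions for illustration.

```csharp
using System;

namespace ReflectionSketch
{
    public abstract class Fruit
    {
        private float m_Weight = 0.0f;

        public float Weight { get { return m_Weight; } }

        public void SetWeight(float weight) { m_Weight = weight; }
    }

    public class Apple : Fruit { }
    public class Orange : Fruit { }
    public class Bannana : Fruit { }   // spelled as in the original post

    // Hypothetical helper, not part of the original code.
    public static class ReflectionFruitLoader
    {
        public static Fruit Create(string typeName, float weight)
        {
            // Without an assembly name, Type.GetType searches the calling
            // assembly and the core library. The name must be namespace-qualified.
            Type type = Type.GetType("ReflectionSketch." + typeName);

            // Type.GetType returns null when the type is not found.
            if (type == null)
                throw new ArgumentException(typeName + " is not a known type.");

            // Uses the public parameterless constructor of the type.
            object instance = Activator.CreateInstance(type);

            // Cast to the base type, then fill in the rest of the data.
            Fruit fruit = (Fruit)instance;
            fruit.SetWeight(weight);
            return fruit;
        }
    }
}
```

Usage would look like ReflectionFruitLoader.Create("Apple", 2.12f). Note that this sketch does no checking beyond the null test, so the cast fails with an InvalidCastException if the name resolves to a non-Fruit type.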
 
Nicole Calinoiu

Jon,

There are design patterns that cover this sort of thing. Depending on the
details of your scenario, you might find either the abstract factory or the
factory method to be applicable (see
http://www.dofactory.com/patterns/Patterns.aspx for some C# examples).

However, if you feel this is too complex, and your situation is really as
simple as you present (and your pear is really meant to be an orange <g>), a
simple switch on the name found in the file should do the trick. e.g.:

<foreach line in the file>
{
    Fruit newFruit = null;

    string fruitName = <the name read from the line>;
    switch (fruitName)
    {
        case "Apple":
        {
            newFruit = new Apple();
            break;
        }
        case "Orange":
        {
            newFruit = new Orange();
            break;
        }
        case "Banana":   // the original post spells this class "Bannana"
        {
            newFruit = new Banana();
            break;
        }
        default:
        {
            throw new ApplicationException(fruitName + " is not a recognized fruit type.");
        }
    }

    newFruit.setweight(<weight read from file, converted to a float value>);
    theFruit.Add(newFruit);
}
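
For anyone wanting to fill in the placeholders, here is one possible sketch of the loop above. It assumes the file starts with a single header line, that each data line is "<type> <weight>" separated by spaces, and that weights use "." as the decimal separator. The namespace SwitchSketch, the class FruitFileParser and the Weight property are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

namespace SwitchSketch
{
    public abstract class Fruit
    {
        private float m_Weight = 0.0f;

        public float Weight { get { return m_Weight; } }

        public void SetWeight(float weight) { m_Weight = weight; }
    }

    public class Apple : Fruit { }
    public class Orange : Fruit { }
    public class Bannana : Fruit { }   // spelled as in the original post

    // Hypothetical helper, not part of the original code.
    public static class FruitFileParser
    {
        public static List<Fruit> Parse(IEnumerable<string> lines)
        {
            List<Fruit> theFruit = new List<Fruit>();
            bool headerSeen = false;

            foreach (string line in lines)
            {
                if (line.Trim().Length == 0) continue;                // skip blank lines
                if (!headerSeen) { headerSeen = true; continue; }     // skip the header line

                string[] parts = line.Split(new char[] { ' ' },
                                            StringSplitOptions.RemoveEmptyEntries);
                if (parts.Length != 2)
                    throw new FormatException("Expected '<type> <weight>' but got: " + line);

                Fruit newFruit;
                switch (parts[0])
                {
                    case "Apple": newFruit = new Apple(); break;
                    case "Orange": newFruit = new Orange(); break;
                    case "Bannana": newFruit = new Bannana(); break;
                    default:
                        throw new ApplicationException(parts[0] + " is not a recognized fruit type.");
                }

                // Invariant culture so "1.03" parses the same on every machine.
                newFruit.SetWeight(float.Parse(parts[1], CultureInfo.InvariantCulture));
                theFruit.Add(newFruit);
            }

            return theFruit;
        }
    }
}
```

The lines could come from System.IO.File.ReadLines("fruity.txt"). With the sample file from the question, the "Pear" line would hit the default case and throw.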

HTH,
Nicole
 
Nicole Calinoiu

And thereby allow potentially dangerous invocation of any validly named
object based on data coming from who knows where? At a bare minimum,
checking if the specified type is actually a Fruit subclass before creating
an instance might be a good idea. However, even that might not be
sufficient. For example, the data could specify the name of a class that
inherits from Fruit but is never meant to be instantiated based on the file
data. Such a class could be dangerous, or even just error out in
unexpected ways (e.g.: another abstract base class that inherits from
Fruit).

If one follows the usual "don't trust user data" rules, validating for
membership in the list of allowed class names is a much better approach. In
addition, this approach would allow for re-mapping of a name onto another
class if necessary (e.g.: "Apple" data is used to specify invocation of a
new FancyApple class).
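
A rough sketch of what those checks could look like when layered on the reflection approach, assuming a made-up namespace AllowListSketch and a hypothetical helper CheckedFruitLoader. The Citrus class is invented here purely to stand in for "another abstract base class that inherits from Fruit".

```csharp
using System;
using System.Collections.Generic;

namespace AllowListSketch
{
    public abstract class Fruit { }

    // Inherits from Fruit but must never be instantiated from file data.
    public abstract class Citrus : Fruit { }

    public class Apple : Fruit { }
    public class Orange : Citrus { }

    // Hypothetical helper, not part of the original code.
    public static class CheckedFruitLoader
    {
        // Strict inclusion list: only these names are ever looked up.
        private static readonly HashSet<string> AllowedNames =
            new HashSet<string> { "Apple", "Orange" };

        public static Fruit Create(string typeName)
        {
            if (!AllowedNames.Contains(typeName))
                throw new ArgumentException(typeName + " is not an allowed fruit type.");

            Type type = Type.GetType("AllowListSketch." + typeName);

            // Belt and braces: the type must exist, be concrete, and derive from Fruit.
            if (type == null || type.IsAbstract || !type.IsSubclassOf(typeof(Fruit)))
                throw new ArgumentException(typeName + " cannot be created as a Fruit.");

            return (Fruit)Activator.CreateInstance(type);
        }
    }
}
```

With the inclusion list in place, names such as "Citrus" or "System.String" are rejected before any type lookup happens.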

Nicole
 
Jon Skeet

Nicole Calinoiu said:
And thereby allow potentially dangerous invocation of any validly named
object based on data coming from who knows where?

Well, if the type doesn't include the assembly name (which it won't if
it's taken from a comma-separated list as shown) then it can only come
from the executing assembly itself or mscorlib...

At a bare minimum,
checking if the specified type is actually a Fruit subclass before creating
an instance might be a good idea.

Sure. I was offering an outline, not bullet-proof production code.

However, even that might not be
sufficient. For example, the data could specify the name of a class that
inherits from Fruit but is never meant to be instantiated based on the file
data. Such a class could be dangerous, or even just error out in
unexpected ways (e.g.: another abstract base class that inherits from
Fruit).

If one follows the usual "don't trust user data" rules, validating for
membership in the list of allowed class names is a much better approach. In
addition, this approach would allow for re-mapping of a name onto another
class if necessary (e.g.: "Apple" data is used to specify invocation of a
new FancyApple class).

Yes. It really depends on exactly how this data was generated, the rest
of the application etc. There are times when you've got to trust user-
provided data (e.g. if the user could give their own fruit class, in
which case the OP would also need a way of loading the appropriate
assembly) and times where it's best not to. I was basically just
showing how to go about instantiating a type given only the name of the
type at runtime.

However, *if* all classes (and there could be many) provide an
identical constructor form, a simple map from name to type, followed by
the Activator.CreateInstance call shown before could be neater (albeit
fractionally slower) than the direct instantiation way you showed.
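
A minimal sketch of the map idea, assuming every class has a public parameterless constructor. The namespace TypeMapSketch, the class MappedFruitFactory and the Weight property are made-up names, and the "Apple" to FancyApple entry just illustrates the re-mapping mentioned earlier in the thread.

```csharp
using System;
using System.Collections.Generic;

namespace TypeMapSketch
{
    public abstract class Fruit
    {
        private float m_Weight = 0.0f;

        public float Weight { get { return m_Weight; } }

        public void SetWeight(float weight) { m_Weight = weight; }
    }

    public class Apple : Fruit { }
    public class FancyApple : Apple { }
    public class Orange : Fruit { }
    public class Bannana : Fruit { }   // spelled as in the original post

    // Hypothetical helper, not part of the original code.
    public static class MappedFruitFactory
    {
        // The map doubles as the list of allowed names.
        private static readonly Dictionary<string, Type> TypesByName =
            new Dictionary<string, Type>
            {
                { "Apple", typeof(FancyApple) },   // name re-mapped onto another class
                { "Orange", typeof(Orange) },
                { "Bannana", typeof(Bannana) }
            };

        public static Fruit Create(string name, float weight)
        {
            Type type;
            if (!TypesByName.TryGetValue(name, out type))
                throw new ArgumentException(name + " is not a recognized fruit type.");

            // Identical constructor form for every class, so one call covers them all.
            Fruit fruit = (Fruit)Activator.CreateInstance(type);
            fruit.SetWeight(weight);
            return fruit;
        }
    }
}
```

Adding a new fruit then means adding one entry to the table rather than another case block.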
 
Nicole Calinoiu

Jon Skeet said:
Well, if the type doesn't include the assembly name (which it won't if
it's taken from a comma-separated list as shown)

Actually, it was space-delimited in the example, which doesn't preclude
specification of the assembly. Even comma-delimited could allow it if
individual data elements are wrapped in quotes.

then it can only come
from the executing assembly itself or mscorlib...

And there's nothing in mscorlib that one might not want to invoke like this?


Sure. I was offering an outline, not bullet-proof production code.

I did realize that. <g> Unfortunately, I've seen (and subsequently had to
fix) quite a bit of production code that uses "nifty" tricks like this
without any consideration of the run-time consequences, and I worry a lot
more about what the potential beginner readers of this thread will do in
their code than what you might do in yours.

Yes. It really depends on exactly how this data was generated, the rest
of the application etc.

Not really. All you know is that it's in a file. Unless you're monitoring
the file for potential tampering, and the monitor has never been "turned off",
and you're absolutely certain it's unhackable, you have no idea how the data
got into the file.

There are times when you've got to trust user-
provided data

As a special privilege within an application, maybe. Otherwise, I'd have
to disagree.

(e.g. if the user could give their own fruit class, in
which case the OP would also need a way of loading the appropriate
assembly) and times where it's best not to.

For this kind of extensibility, it might be best to allow the programmer
"user" to extend a factory class in order to allow instantiation of their
custom objects. Personally, I wouldn't want to be blamed for their security
holes. <g>

However, *if* all classes (and there could be many) provide an
identical constructor form, a simple map from name to type, followed by
the Activator.CreateInstance call shown before could be neater (albeit
fractionally slower) than the direct instantiation way you showed.

I agree, but I would use the following form to both enforce the target type
restrictions and get around the constructor limitation:

switch (theClassName)
{
    case "a":
    case "b":
    {
        newObject = Activator.CreateInstance(Type.GetType(theClassName));
        break;
    }
    case "c":
    case "d":
    {
        newObject = Activator.CreateInstance(Type.GetType(theClassName),
            <some args>);
        break;
    }
    case "e":
    {
        newObject = Activator.CreateInstance(Type.GetType(theClassName),
            <some other args>);
        break;
    }
    ...
    default:
    {
        throw <a big, fat exception>;
    }
}
 
Jon Skeet

Nicole Calinoiu said:
Actually, it was space-delimited in the example, which doesn't preclude
specification of the assembly.

True, whoops.

Even comma-delimited could allow it if
individual data elements are wrapped in quotes.

Assuming the code coped with quotes, of course - which it may not need
to.

And there's nothing in mscorlib that one might not want to invoke like this?

I never suggested that.

Not really. All you know is that it's in a file. Unless you're monitoring
the file for potential tampering, and the monitor has never been "turned off",
and you're absolutely certain it's unhackable, you have no idea how the data
got into the file.

Are you also going to check somehow that the assembly hasn't been
tampered with either?

As a special privilege within an application, maybe. Otherwise, I'd have
to disagree.

I guess we'll have to agree to disagree then.

For this kind of extensibility, it might be best to allow the programmer
"user" to extend a factory class in order to allow instantiation of their
custom objects. Personally, I wouldn't want to be blamed for their security
holes. <g>

How do you then instantiate the factory class without running into the
same potential problem?

I agree, but I would use the following form to both enforce the target type
restrictions and get around the constructor limitation:

<snip>

I think it really depends on stuff we don't know anything about: how
many types are involved, how many different sets of parameters are
involved, etc. I generally find a table lookup more maintainable, but I
certainly wouldn't like to say without more information.
 
Nicole Calinoiu

Jon Skeet said:
And there's nothing in mscorlib that one might not want to invoke like this?

I never suggested that.

I didn't really mean to imply that you had. 'Twas meant to be a bit of
irony, but I guess that didn't really come through...

Are you also going to check somehow that the assembly hasn't been
tampered with either?


Well, .NET does have some lovely mechanisms for doing just that with very
little work on the behalf of the developer. I'd be even more worried about
tampering with the tracking log than with the assembly itself. That said, I
would never implement this sort of mechanism. IMO, it's quite a bit less
work to apply the validation that is necessary if one doesn't trust the
data.

I guess we'll have to agree to disagree then.

Different perspectives, I'd guess. I've probably spent way too long working
on web apps with privacy implications, where even app, machine, and db
admins may not read or modify most of the data. All my users are
potentially nosey snoops, and script kiddies might even outnumber legitimate
users. A few years of that tends to make one a bit cynical about user
input...


How do you then instantiate the factory class without running into the
same potential problem?

CAS with custom permissions and evidence. For example, a developer edition
of the software could include a token used as evidence in granting
permissions to call the protected and public members of the base factory
class.

I think it really depends on stuff we don't know anything about: how
many types are involved, how many different sets of parameters are
involved, etc. I generally find a table lookup more maintainable, but I
certainly wouldn't like to say without more information.

I sort of like the lookup approach too for large parameter sets,
particularly if it's reasonable to use the prototype pattern, but only if
the class set is substantially smaller than the parameter set. I would hope
that I would wake up and see the forest for the fruit before a 100-headed
monster creeps up on me... <g>
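
One way the lookup-plus-prototype idea might be sketched: a table of pre-configured instances that are cloned on request. Everything named here (the namespace PrototypeSketch, the class FruitPrototypes, the Clone method, the Weight property and the default weights) is invented for illustration, and the shallow copy is only adequate because the class holds nothing but a float.

```csharp
using System;
using System.Collections.Generic;

namespace PrototypeSketch
{
    public abstract class Fruit
    {
        private float m_Weight = 0.0f;

        public float Weight { get { return m_Weight; } }

        public void SetWeight(float weight) { m_Weight = weight; }

        // MemberwiseClone makes a shallow copy that keeps the runtime type.
        public Fruit Clone() { return (Fruit)MemberwiseClone(); }
    }

    public class Apple : Fruit { }
    public class Orange : Fruit { }

    // Hypothetical helper, not part of the original code.
    public static class FruitPrototypes
    {
        private static readonly Dictionary<string, Fruit> Prototypes = CreatePrototypes();

        private static Dictionary<string, Fruit> CreatePrototypes()
        {
            Dictionary<string, Fruit> table = new Dictionary<string, Fruit>();

            // Each prototype carries its own parameter set (made-up defaults).
            Fruit apple = new Apple();
            apple.SetWeight(2.0f);
            table["Apple"] = apple;

            Fruit orange = new Orange();
            orange.SetWeight(3.0f);
            table["Orange"] = orange;

            return table;
        }

        public static Fruit Create(string name)
        {
            Fruit prototype;
            if (!Prototypes.TryGetValue(name, out prototype))
                throw new ArgumentException(name + " is not a recognized fruit type.");

            // Hand out a copy so callers never modify the prototype itself.
            return prototype.Clone();
        }
    }
}
```

No reflection is involved at all in this variant, so the set of creatable types is fixed at compile time.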
 
Jon Skeet

Different perspectives, I'd guess. I've probably spent way too long working
on web apps with privacy implications, where even app, machine, and db
admins may not read or modify most of the data. All my users are
potentially nosey snoops, and script kiddies might even outnumber legitimate
users. A few years of that tends to make one a bit cynical about user
input...

For webapps, certainly - although there when you're talking about users
you're talking about people without access to the file system, so
whereas you don't trust input that comes over the wire, you may be able
to trust data which you yourself have previously written to disk.

CAS with custom permissions and evidence. For example, a developer edition
of the software could include a token used as evidence in granting
permissions to call the protected and public members of the base factory
class.

All that is fine when it's absolutely needed - but frankly in most
situations I think it's going to be over the top.

I sort of like the lookup approach too for large parameter sets,
particularly if it's reasonable to use the prototype pattern, but only if
the class set is substantially smaller than the parameter set. I would hope
that I would wake up and see the forest for the fruit before a 100-headed
monster creeps up on me... <g>

Fair enough :)
 
Nicole Calinoiu

Jon Skeet said:
All that is fine when it's absolutely needed - but frankly in most
situations I think it's going to be over the top.

I most definitely agree. However, I do think that this particular situation
of instantiating objects based on their names is sufficiently dangerous that
adequate protection is necessary. My preference would be to validate the
externally sourced names against a strict inclusion list and prevent any
external use of the factory classes.

If external use is necessary (e.g.: because of an extensibility
requirement), there's probably some simpler mechanism for ensuring that use
from external code is reasonably safe. I haven't yet encountered a need to
expose such potentially dangerous code for extensibility in my own work, so
I haven't really dug deep into the potential ramifications of just allowing
OOP "tricks" and standard CAS to handle the issue.
 
