Binary Serialization overhead

bamelyan

My memory stream length after binary serialization was always coming out
too large, so I started investigating. Even if I leave my GetObjectData()
method blank, so that no fields are serialized, the memory stream still comes
out to 143 bytes!

public void GetObjectData(SerializationInfo info, StreamingContext context)
{
    // left blank, no fields of the class are being serialized
}

How can I avoid this extra overhead?

Thanks!
 
Peter Duniho

> My memory stream length after binary serialization was always coming out
> too large, so I started investigating. Even if I leave my GetObjectData()
> method blank, so that no fields are serialized, the memory stream still
> comes out to 143 bytes!
>
> public void GetObjectData(SerializationInfo info, StreamingContext context)
> {
>     // left blank, no fields of the class are being serialized
> }
>
> How can I avoid this extra overhead?

Don't use binary serialization.

You _might_ find, ironically enough, that using XML serialization and then
compressing the stream performs better. And this is especially likely to
be true if you do the serialization explicitly, rather than letting the
XML formatter do it.
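
As a rough sketch of the "explicit XML plus compression" idea, the following
writes a list of ints as hand-written XML through a GZipStream. The class
name, method names, and the "values"/"v" element names are assumptions made up
for this illustration. Whether the compressed form ends up smaller depends on
the data: gzip adds its own header and trailer, so very small payloads can
grow rather than shrink.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper; names are illustrative only.
public static class XmlGzipDemo
{
    // Writes the values as hand-written XML (no XmlSerializer) to any stream.
    static void WriteXml(Stream output, int[] values)
    {
        var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };
        using (XmlWriter xml = XmlWriter.Create(output, settings))
        {
            xml.WriteStartElement("values");
            foreach (int v in values)
                xml.WriteElementString("v", XmlConvert.ToString(v));
            xml.WriteEndElement();
        }
    }

    // Plain XML, for size comparison.
    public static byte[] ToXml(int[] values)
    {
        using (var buffer = new MemoryStream())
        {
            WriteXml(buffer, values);
            return buffer.ToArray();
        }
    }

    // The same XML, written through a gzip compressor.
    public static byte[] ToCompressedXml(int[] values)
    {
        using (var buffer = new MemoryStream())
        {
            // Disposing the GZipStream flushes the last block and the trailer.
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
                WriteXml(gzip, values);
            return buffer.ToArray();
        }
    }

    // Decompresses and parses the XML back into the original values.
    public static int[] FromCompressedXml(byte[] data)
    {
        using (var buffer = new MemoryStream(data))
        using (var gzip = new GZipStream(buffer, CompressionMode.Decompress))
        {
            XDocument doc = XDocument.Load(gzip);
            return doc.Root.Elements("v").Select(e => (int)e).ToArray();
        }
    }

    public static void Main()
    {
        int[] values = Enumerable.Range(0, 1000).ToArray();
        Console.WriteLine("XML:        {0} bytes", ToXml(values).Length);
        Console.WriteLine("XML + gzip: {0} bytes", ToCompressedXml(values).Length);
    }
}
```

Running Main prints both sizes, so the trade-off can be measured on your own
data rather than assumed.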

If you want maximum performance, your best bet is to serialize explicitly
using BinaryWriter and BinaryReader, imposing your own explicit
application protocol on the stream. Then when you write a 32-bit int to
the stream, it actually winds up as 4 bytes.
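
A minimal sketch of that approach, assuming a hypothetical Point3 class with
three int fields (the type and method names here are invented for the
example). The stream holds only the field values, with no assembly or type
metadata:

```csharp
using System;
using System.IO;

// Hypothetical example type; the field names are illustrative only.
public sealed class Point3
{
    public int X, Y, Z;

    // Writes exactly three 32-bit ints: 12 bytes, no type metadata.
    public void WriteTo(BinaryWriter writer)
    {
        writer.Write(X);
        writer.Write(Y);
        writer.Write(Z);
    }

    // Reads the fields back in the same order they were written.
    public static Point3 ReadFrom(BinaryReader reader)
    {
        return new Point3
        {
            X = reader.ReadInt32(),
            Y = reader.ReadInt32(),
            Z = reader.ReadInt32()
        };
    }
}

public static class ExplicitDemo
{
    public static byte[] Serialize(Point3 p)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            p.WriteTo(writer);
            writer.Flush();
            return stream.ToArray();
        }
    }

    public static Point3 Deserialize(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new BinaryReader(stream))
        {
            return Point3.ReadFrom(reader);
        }
    }

    public static void Main()
    {
        byte[] data = Serialize(new Point3 { X = 1, Y = 2, Z = 3 });
        Console.WriteLine(data.Length); // 12: three 4-byte ints
    }
}
```

The cost is that both ends must agree on the field order and types, since
nothing in the stream describes them.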

Of course, if you're only ever putting this into a memory stream, it raises
the question of why you need to serialize at all. Making a copy of the
object (for example) seems like a better approach anyway. So perhaps one
option is to not serialize at all, depending on your exact needs.

Pete
 
bamelyan

There are a few problems with what you are suggesting:

1) I found binary serialization to be more efficient than XML serialization.
2) Serializing to XML and compressing imposes an even larger performance
penalty.
3) If you want to pass an object between two machines, Serialization is
necessary.
4) Serializing explicitly fixes the layout of your structure. If you want to
add, remove, or move a field, you are doomed. Implementing the ISerializable
interface on the object allows for maximum flexibility!

At this point, I am just looking to see if there is a way to reduce the 143
bytes of overhead, which I believe just carries my object's identifying
information for deserialization, e.g. assembly, version, class name,
namespace, etc.

Thanks!
 
Peter Duniho

> There are a few problems with what you are suggesting:
>
> 1) I found binary serialization to be more efficient than XML serialization.

That depends on how you serialize.

> 2) Serializing to XML and compressing imposes an even larger performance
> penalty.

Your original question made no mention of a time-cost performance issue,
only a data-size cost. That said, the fact is that a
simple GZipStream compression is an inconsequential cost in comparison to
actually moving data around, especially where a network is involved. If
you are just storing the data in-memory, maybe that's less of a concern to
you. But it doesn't mean that the cost of compression is actually a
problem.

> 3) If you want to pass an object between two machines, Serialization is
> necessary.

"Serialization" with a capital "S"? Absolutely not. Sure, you need to
convert your data into some format readable at the other end. But there's
definitely no requirement that you use the built-in .NET serialization
support.

> 4) Serializing explicitly fixes the layout of your structure. If you want
> to add, remove, or move a field, you are doomed. Implementing the
> ISerializable interface on the object allows for maximum flexibility!

Only if you are going to include versioning in the protocol or always make
sure both ends of the connection are running the same version. These are
issues that exist no matter what serialization technique you use, explicit
or some implicit built-in approach.
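
One way to include versioning in an explicit protocol is to lead each record
with a version byte and let the reader branch on it. The sketch below assumes
a hypothetical Person record whose Nickname field was added in version 2;
every name in it is invented for illustration, not taken from any real
format.

```csharp
using System;
using System.IO;

// Hypothetical record; version 2 of the format added Nickname.
public sealed class Person
{
    public const byte CurrentVersion = 2;

    public string Name = "";
    public int Age;
    public string Nickname = ""; // added in version 2

    // Always writes the newest layout, prefixed with its version byte.
    public void WriteTo(BinaryWriter writer)
    {
        writer.Write(CurrentVersion);
        writer.Write(Name);
        writer.Write(Age);
        writer.Write(Nickname);
    }

    // Accepts any known version; fields missing from older versions
    // keep their default values.
    public static Person ReadFrom(BinaryReader reader)
    {
        byte version = reader.ReadByte();
        if (version < 1 || version > CurrentVersion)
            throw new InvalidDataException("Unsupported format version " + version);

        var p = new Person();
        p.Name = reader.ReadString();
        p.Age = reader.ReadInt32();
        if (version >= 2)
            p.Nickname = reader.ReadString();
        return p;
    }
}

public static class VersionedDemo
{
    // Builds a payload in the old version-1 layout, as an older sender would.
    public static byte[] WriteV1(string name, int age)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write((byte)1);
            writer.Write(name);
            writer.Write(age);
            writer.Flush();
            return stream.ToArray();
        }
    }

    public static byte[] WriteCurrent(Person p)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            p.WriteTo(writer);
            writer.Flush();
            return stream.ToArray();
        }
    }

    public static Person Read(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return Person.ReadFrom(reader);
        }
    }
}
```

Here the versioning overhead is a single byte per record, but the reader has
to carry a branch for every layout it still needs to understand.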

The fact is, the less work you want to put into it yourself, the more
likely you are to have overhead problems. The .NET serialization
implementation includes features that address some of the concerns you
mention, but those features aren't free. They necessarily impose
additional overhead, and you are stuck with whatever other inefficiencies
might exist as well.

Maximum flexibility is mutually exclusive with maximum performance. You
need to decide which is more important to you.

> At this point, I am just looking to see if there is a way to reduce the 143
> bytes of overhead, which I believe just carries my object's identifying
> information for deserialization, e.g. assembly, version, class name,
> namespace, etc.

There are ways to do that. You just don't seem to want to actually use
any of them.

Pete
 
bamelyan

Thanks for the prompt response, but unfortunately it does not address my
question.

Is there a way to reduce the 143 bytes of overhead in a properly serialized
stream? I believe it just carries my object's identifying information for
deserialization, e.g. assembly, version, class name, namespace, etc.
 
Peter Duniho

[...]
> Is there a way to reduce the 143 bytes of overhead in a properly serialized
> stream? I believe it just carries my object's identifying information for
> deserialization, e.g. assembly, version, class name, namespace, etc.

As long as you continue to use your particular definition of "properly
serialized stream" (a definition I don't agree with, obviously), the
answer is no. Those are all things that the serialization you're using
depends on for the features you say you want.

If you want the features, you have to live with the overhead.

Pete
 
