Fast binary serialization/deserialization of objects


schoenfeld1

I've implemented IPC between two applications using named pipes and
binary serialization, but have noticed that the binary formatter is
rather slow.

It seems that the binary formatter reflects over the entire type every time
it is invoked to serialize/deserialize an object of that type.

Is there a way to prepare the binary formatter with a pre-defined type,
such that it only reflects once but can be re-used to
serialize/deserialize objects of that type without performing
reflection again?

Here is the relevant code:

static public byte[] Serialize(Message message) {
    #region Pre-conditions
    Debug.Assert(message != null);
    #endregion
    MemoryStream memStream = new MemoryStream(1024);
    BinaryFormatter binFormatter = new BinaryFormatter();
    binFormatter.Serialize(memStream, message);
    return memStream.GetBuffer();
}


static public Message Deserialize(byte[] data) {
    #region Pre-conditions
    Debug.Assert(data != null);
    #endregion
    BinaryFormatter binFormatter = new BinaryFormatter();
    MemoryStream memStream = new MemoryStream(data);
    object obj = binFormatter.Deserialize(memStream);
    Debug.Assert(obj is Message);
    return obj as Message;
}

Any assistance is appreciated.
 

Steve Walker

In message, schoenfeld1 wrote:
I've implemented IPC between two applications using named pipes and
binary serialization, but have noticed that the binary formatter is
rather slow.

It seems that the binary formatter reflects over the entire type every time
it is invoked to serialize/deserialize an object of that type.

Is there a way to prepare the binary formatter with a pre-defined type,
such that it only reflects once but can be re-used to
serialize/deserialize objects of that type without performing
reflection again?

Implement ISerializable and write your own serialization code. If you're
passing anything substantial you can get a significant performance
improvement. The downside is the potential maintenance costs, since you
will probably have to touch your serialization code whenever members are
added to your class. I wouldn't do it unless I had to, and I think I've
only ever done it once in production code.
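For illustration, here is a minimal sketch of what that looks like. The `Message` fields here (`id`, `body`) are assumptions for the example, not taken from the thread:

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public class Message : ISerializable
{
    private int id;       // hypothetical fields; the real Message
    private string body;  // class's members are not shown in the thread

    public Message(int id, string body)
    {
        this.id = id;
        this.body = body;
    }

    // Special constructor the formatter calls during deserialization.
    protected Message(SerializationInfo info, StreamingContext context)
    {
        id = info.GetInt32("id");
        body = info.GetString("body");
    }

    // Hand-written field list; the formatter no longer has to
    // reflect over the type to discover what to serialize.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("id", id);
        info.AddValue("body", body);
    }
}
```

Note the reads in the deserialization constructor must use the same names passed to `AddValue`.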
 

schoenfeld1

Steve said:
Implement ISerializable and write your own serialization code. If you're
passing anything substantial you can get a significant performance
improvement. The downside is the potential maintenance costs, since you
will probably have to touch your serialization code whenever members are
added to your class. I wouldn't do it unless I had to, and I think I've
only ever done it once in production code.

This is not a problem at all. My objects are not complex; they are
just frequently serialized/deserialized. Do you think custom
serialization would provide a significant performance boost in this
case, or should I just implement my own object serialization?
 

Steve Walker

In message, schoenfeld1 wrote:

This is not a problem at all. My objects are not complex at all. They
are just frequently serialized/deserialized. Do you think custom
serialization would provide significant performance boost in this case,
or should I just implement my own object serialization?

I think you'll probably have to benchmark it to find out.

If I remember rightly, the project I used it in was serializing a large
collection of Person objects, each of which contained subobjects. We
flattened each person and subobjects into a struct and stuffed an array
of those into SerializationInfo. I've no idea whether what we did was
optimal or not, there wasn't time to experiment further, but it did get
us a massive improvement in performance over the default mechanism.

If you aren't asking too much of the default mechanism (as we were) to
begin with, the gain may be less significant.
 

schoenfeld1

Steve said:
I think you'll probably have to benchmark it to find out.

If I remember rightly, the project I used it in was serializing a large
collection of Person objects, each of which contained subobjects. We
flattened each person and subobjects into a struct and stuffed an array
of those into SerializationInfo. I've no idea whether what we did was
optimal or not, there wasn't time to experiment further, but it did get
us a massive improvement in performance over the default mechanism.

If you aren't asking too much of the default mechanism (as we were) to
begin with, the gain may be less significant.

Thanks a lot for your help. I've seen a sufficient performance
increase with custom serialization, rendering this a non-issue.
 

schoenfeld1


For future lurkers, I've corrected the bugs in the previous code
snippet: the streams are now closed when done, the MemoryStream
capacity is no longer pre-allocated when serializing, and GetBuffer()
has been replaced with ToArray(), since GetBuffer() can return extra
unused bytes from the stream's internal buffer.

static public byte[] Serialize(Message message) {
    #region Pre-conditions
    Debug.Assert(message != null);
    #endregion
    MemoryStream memStream = new MemoryStream();
    BinaryFormatter binFormatter = new BinaryFormatter();
    binFormatter.Serialize(memStream, message);
    // ToArray() copies exactly the bytes written; GetBuffer() would
    // return the whole internal buffer, including unused capacity.
    byte[] data = memStream.ToArray();
    memStream.Close();
    return data;
}


static public Message Deserialize(byte[] data) {
    #region Pre-conditions
    Debug.Assert(data != null);
    #endregion
    BinaryFormatter binFormatter = new BinaryFormatter();
    MemoryStream memStream = new MemoryStream(data);
    object obj = binFormatter.Deserialize(memStream);
    memStream.Close();
    Debug.Assert(obj is Message);
    return obj as Message;
}
 

Michael S

schoenfeld1 said:
This is not a problem at all. My objects are not complex at all. They
are just frequently serialized/deserialized. Do you think custom
serialization would provide significant performance boost in this case,
or should I just implement my own object serialization?

If it's fairly simple objects I would implement my own serialization.
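A hand-rolled sketch of that, assuming a hypothetical Message holding just an int and a string (the real class's members aren't shown in the thread), would use BinaryWriter/BinaryReader directly and skip the formatter, and with it all reflection, entirely:

```csharp
using System;
using System.IO;

// Hypothetical payload; an int and a string stand in for the
// real Message members, which are not shown in the thread.
public class Message
{
    public int Id;
    public string Body;
}

public static class MessageCodec
{
    public static byte[] Serialize(Message message)
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(message.Id);    // 4 bytes, fixed size
            writer.Write(message.Body);  // length-prefixed string
            writer.Flush();
            return stream.ToArray();     // exactly the bytes written
        }
    }

    public static Message Deserialize(byte[] data)
    {
        using (MemoryStream stream = new MemoryStream(data))
        using (BinaryReader reader = new BinaryReader(stream))
        {
            Message message = new Message();
            // Reads must mirror the writes, in the same order.
            message.Id = reader.ReadInt32();
            message.Body = reader.ReadString();
            return message;
        }
    }
}
```

The trade-off is the same one Steve mentioned: the read/write code must be kept in sync by hand whenever the class gains a member.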

Also, if the objects are constantly being created, loaded, used, saved
and forgotten, I would consider using an object pool for unused objects
and re-using instances rather than creating new ones.

Something like this:

MyObj myObj = MyObjPool.Pop();
if (myObj == null)
{
    myObj = new MyObj();  // pool was empty; create a fresh instance
}
myObj.Load(someId);
myObj.Use();
myObj.Save(someId);
MyObjPool.Push(myObj);  // return the instance to the pool for re-use

Happy Pooling
- Michael S
 

Steve Walker

In message, schoenfeld1 wrote:

Thanks a lot for your help. I've noticed a sufficient performance
increase with custom serialization rendering this a non-issue.

Welcome.

Just had a play with this. Looks like a collection of:

class Foo
{
    int a;
    int b;
    int c;
    int d;
    int e;
    int f;
    string g;
    string h;
    string i;
    string j;
    string k;
    string l;
}

serializes a lot faster than a collection of:

class Bar
{
    private Ints ints;
    string g;
    string h;
    string i;
    string j;
    string k;
    string l;

    private class Ints
    {
        int a;
        int b;
        int c;
        int d;
        int e;
        int f;
    }
}

and so performance-wise it's worthwhile implementing ISerializable in
the collection:

Assuming we've defined (Foo) and (Bar) as operators to convert between
the two:

public BarCollection(SerializationInfo si, StreamingContext context)
{
    Foo[] foos = (Foo[])si.GetValue("stuff", typeof(Foo[]));

    foreach(Foo item in foos)
    {
        this.Add((Bar)item);
    }
}

public void GetObjectData(SerializationInfo info,
                          StreamingContext context)
{
    Foo[] foos = new Foo[this.Count];

    for(int i = 0; i < this.Count; i++)
    {
        foos[i] = (Foo)this[i];
    }

    info.AddValue("stuff", foos);
}



I still wouldn't do it unless something was really too slow, but
interesting all the same.

So, anybody know enough about how the default binary serialization
mechanism works under the hood to explain the difference?
 
