A bug in .Net Binary Serialization?

ztRon

Hi all,

I recently came across something really strange, and after a couple of days
of debugging I finally nailed down the cause. However, I have absolutely no
idea whether I am doing something wrong or whether it is a bug in binary
serialization. The following is a simple example of the code:




using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            A a = new A();
            B b = new B(a);
            List<C> cList = new List<C>();
            for (int i = 0; i < 10000; i++)
            {
                cList.Add(new C("someValue"));
            }
            b.CList = cList;

            MemoryStream stream = new MemoryStream();
            BinaryFormatter objFormatter = new BinaryFormatter();
            objFormatter.Serialize(stream, b);

            // Print the size of the serialized data discussed below.
            Console.WriteLine(stream.Length);
        }
    }

    [Serializable]
    class A
    {
        private Dictionary<string, string> _dic1 = new Dictionary<string, string>();

        public A()
        {
            _dic1.Add("key1", "value1");
            _dic1.Add("key2", "value2");
        }
    }

    [Serializable]
    class B
    {
        private List<C> _cList = new List<C>();
        private A _a;

        public B(A a)
        {
            _a = a;
        }

        public List<C> CList
        {
            get { return _cList; }
            set { _cList = value; }
        }
    }

    [Serializable]
    class C
    {
        private Dictionary<string, string> _dic2 = new Dictionary<string, string>();
        private string _value;

        public C(string value)
        {
            _value = value;
        }
    }
}

If you run the code, you will find that the stream has a length of 4,532,517
bytes. Now, try changing _dic1 (in class A) to be a Dictionary<string, object>
and run the code again. Now the stream length is 462,924 bytes. Why is there
such a big difference just from changing the type? I also noticed that this
might be related to the fact that I have another dictionary of the same type
in class C.
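
To make the comparison easier to repeat, here is a minimal sketch that serializes both variants in one run and prints the two stream lengths. It assumes a runtime where BinaryFormatter is still available (for example .NET Framework). The generic class A<TValue>, the object-typed _a field in B, and the SerializedLength helper are changes made for this sketch and are not part of the code above, so the exact byte counts may differ somewhat from the figures quoted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

namespace SizeComparisonSketch
{
    // Hypothetical generic stand-in for class A, so the value type of _dic1
    // can be switched without editing the class.
    [Serializable]
    class A<TValue>
    {
        private Dictionary<string, TValue> _dic1 = new Dictionary<string, TValue>();

        public A(TValue value1, TValue value2)
        {
            _dic1.Add("key1", value1);
            _dic1.Add("key2", value2);
        }
    }

    [Serializable]
    class B
    {
        private List<C> _cList = new List<C>();
        private object _a; // object here (assumption) so it can hold either A variant

        public B(object a) { _a = a; }

        public List<C> CList
        {
            get { return _cList; }
            set { _cList = value; }
        }
    }

    [Serializable]
    class C
    {
        private Dictionary<string, string> _dic2 = new Dictionary<string, string>();
        private string _value;

        public C(string value) { _value = value; }
    }

    static class Program
    {
        // Builds the same object graph as the original example around the
        // given A variant and returns the length of the serialized stream.
        public static long SerializedLength(object a)
        {
            B b = new B(a);
            List<C> cList = new List<C>();
            for (int i = 0; i < 10000; i++)
            {
                cList.Add(new C("someValue"));
            }
            b.CList = cList;

            using (MemoryStream stream = new MemoryStream())
            {
                new BinaryFormatter().Serialize(stream, b);
                return stream.Length;
            }
        }

        static void Main()
        {
            Console.WriteLine("Dictionary<string, string>: {0:N0} bytes",
                SerializedLength(new A<string>("value1", "value2")));
            Console.WriteLine("Dictionary<string, object>: {0:N0} bytes",
                SerializedLength(new A<object>("value1", "value2")));
        }
    }
}
```

Running both variants in the same process keeps everything else identical, so any difference in the printed lengths comes from the type of _dic1 alone.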

Am I doing something wrong here? If not, is this a bug?

Thanks in advance!!
 
ztRon

It would be interesting to try to serialize to a more readable format and
see what the specific differences are. I don't have the time at the
moment to explore too much, but it's something you might like to try.

This example was actually derived from more complex code, if that was what
you meant. In my unit testing of it, I noticed that the size recently
tripled due to the addition of one dictionary, even though there is only ever
one instance of it. That was when I started to debug, and I was finally able to
pinpoint the cause and come up with a simpler example that shows the
problem.
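
One low-tech way to look at the specific differences without a text formatter is to count how often a type name is written into the serialized bytes. The sketch below does that for the original object graph. It assumes BinaryFormatter is available (for example on .NET Framework) and that type names appear in the stream as plain ASCII-compatible text. The CountOccurrences helper and the search string are choices made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;

namespace StreamInspectionSketch
{
    [Serializable]
    class A
    {
        private Dictionary<string, string> _dic1 = new Dictionary<string, string>();

        public A()
        {
            _dic1.Add("key1", "value1");
            _dic1.Add("key2", "value2");
        }
    }

    [Serializable]
    class B
    {
        private List<C> _cList = new List<C>();
        private A _a;

        public B(A a) { _a = a; }

        public List<C> CList
        {
            get { return _cList; }
            set { _cList = value; }
        }
    }

    [Serializable]
    class C
    {
        private Dictionary<string, string> _dic2 = new Dictionary<string, string>();
        private string _value;

        public C(string value) { _value = value; }
    }

    static class Program
    {
        // Counts non-overlapping occurrences of the ASCII bytes of 'text' in 'data'.
        public static int CountOccurrences(byte[] data, string text)
        {
            byte[] needle = Encoding.ASCII.GetBytes(text);
            if (needle.Length == 0) return 0;

            int count = 0;
            int i = 0;
            while (i + needle.Length <= data.Length)
            {
                int j = 0;
                while (j < needle.Length && data[i + j] == needle[j]) j++;

                if (j == needle.Length) { count++; i += needle.Length; }
                else { i++; }
            }
            return count;
        }

        static void Main()
        {
            B b = new B(new A());
            List<C> cList = new List<C>();
            for (int i = 0; i < 10000; i++) cList.Add(new C("someValue"));
            b.CList = cList;

            byte[] bytes;
            using (MemoryStream stream = new MemoryStream())
            {
                new BinaryFormatter().Serialize(stream, b);
                bytes = stream.ToArray();
            }

            Console.WriteLine("Total bytes: {0}", bytes.Length);
            Console.WriteLine("Dictionary type name written {0} times",
                CountOccurrences(bytes, "System.Collections.Generic.Dictionary"));
        }
    }
}
```

Comparing the count before and after changing the type of _dic1 should show whether the extra bytes are repeated type information or actual data.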
 
ztRon

The problem with something like the XmlSerializer is that it does not support
serialization of dictionaries.

Does anyone else have any other ideas about this problem?

Thanks.
 
ztRon

But isn't it the same with SOAP? I think SOAP does not support generics,
which means that it doesn't support generic dictionaries either?
 
ztRon

I actually tried that yesterday using the SoapFormatter and it did not work,
which was why I thought maybe you meant something else.
 
SMJT

ztRon,

I don't think this is a bug, but just the way the data is stored when
you use binary serialization.

I had a similar problem when serializing classes to a file, where my
class contained an array of strings. If the string values were all the
same, then only one copy of the string was stored rather than multiple
copies of the same string (which I think is quite clever really, as it
saves space and is probably quicker too).

My bug was that when I changed one of the strings, the serialized class
size changed, so it shouldn't have been written back to the same slot in
my file, and I ended up corrupting my data file.

So I think you don't have a bug, just a feature of binary
serialization.

SMJT
 
not_a_commie

I think you'll have better luck with the newer DataContractSerializer.
It works way better than the older serialization stuff. Here's a cut
from my code.

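// Namespaces this snippet relies on:
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Member of the class being serialized; 'types' lists any extra known types.
public byte[] GetDataBytes(params Type[] types)
{
    var ds = new DataContractSerializer(GetType(), types);
    using (var mem = new MemoryStream())
    {
        //using (var w = XmlDictionaryWriter.CreateTextWriter(mem)) // for xml
        using (var w = XmlDictionaryWriter.CreateBinaryWriter(mem))
        {
            ds.WriteObject(w, this);
        }
        // The writer has been flushed by Dispose, so the buffer is complete.
        return mem.ToArray();
    }
}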

And I prefer the DataContract and DataMember attributes more than the
Serializable one.
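
For anyone who wants to try this route, here is a rough sketch of a complete round trip with the DataContract and DataMember attributes, including a dictionary member. The Settings class and the ToBytes and FromBytes helpers are made up for this example and are not from my code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

namespace DataContractSketch
{
    // Hypothetical type for illustration only.
    [DataContract]
    public class Settings
    {
        [DataMember]
        public string Name;

        [DataMember]
        public Dictionary<string, string> Values = new Dictionary<string, string>();
    }

    public static class Program
    {
        // Serializes a Settings instance using the binary XML writer.
        public static byte[] ToBytes(Settings value)
        {
            var serializer = new DataContractSerializer(typeof(Settings));
            using (var mem = new MemoryStream())
            {
                using (var writer = XmlDictionaryWriter.CreateBinaryWriter(mem))
                {
                    serializer.WriteObject(writer, value);
                }
                // ToArray still works after the writer has closed the stream.
                return mem.ToArray();
            }
        }

        // Reads a Settings instance back from bytes produced by ToBytes.
        public static Settings FromBytes(byte[] data)
        {
            var serializer = new DataContractSerializer(typeof(Settings));
            using (var reader = XmlDictionaryReader.CreateBinaryReader(
                data, XmlDictionaryReaderQuotas.Max))
            {
                return (Settings)serializer.ReadObject(reader);
            }
        }

        public static void Main()
        {
            var original = new Settings { Name = "example" };
            original.Values.Add("key1", "value1");
            original.Values.Add("key2", "value2");

            byte[] bytes = ToBytes(original);
            Settings copy = FromBytes(bytes);

            Console.WriteLine("Serialized size: {0} bytes", bytes.Length);
            Console.WriteLine("Round-tripped entries: {0}", copy.Values.Count);
        }
    }
}
```

Only members marked with DataMember are written, which makes it explicit what ends up in the stream.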
 
