XmlSerializer replacing cr+nl with nl when loading

Andrew

I am using an XmlSerializer to save some settings and I have discovered that
when a string is saved containing a cr+nl it is replaced with just a newline
when loading back in. I am no expert with XML and I wonder if there is an
option I have to change to fix this. I have written some simple test code
to show the problem.

Can anyone please tell me why the cr+nl is being replaced by just a newline
and how to stop this happening?

Thanks


public class SettingsClass
{
    private string _TestString;
    public string TestString
    {
        get
        {
            return _TestString;
        }
        set
        {
            _TestString = value;
        }
    }
}

private void butTest_Click(object sender, EventArgs e)
{
    SettingsClass Settings = new SettingsClass();

    Settings.TestString = "abc" + Environment.NewLine + "xyz";

    // Save the settings
    using (StreamWriter Writer = new StreamWriter(@"c:\work\test.xml"))
    {
        XmlSerializer Serializer = new XmlSerializer(typeof(SettingsClass));

        Serializer.Serialize(Writer, Settings);
    }

    // This shows the string has 8 chars
    MessageBox.Show(Settings.TestString.Length.ToString());

    // Look at the c:\work\test.xml file at this point and we see it does
    // contain a cr+newline between the abc and xyz

    // Load the settings
    using (FileStream Reader = new FileStream(
        @"c:\work\test.xml", FileMode.Open, FileAccess.Read,
        FileShare.Read))
    {
        XmlSerializer Serializer = new XmlSerializer(typeof(SettingsClass));

        Settings = (SettingsClass)Serializer.Deserialize(Reader);
    }

    // This shows the string is now only 7 chars in length.
    // Only the newline remains, the cr has been stripped out!
    MessageBox.Show(Settings.TestString.Length.ToString());
}
 
Pavel Minaev

Andrew said:
I am using an XmlSerializer to save some settings and I have discovered
that when a string is saved containing a cr+nl it is replaced with just a
newline when loading back in. I am no expert with XML and I wonder if
there is an option I have to change to fix this. I have written some simple
test code to show the problem.

Can anyone please tell me why the cr+nl is being replaced by just a
newline and how to stop this happening?

Thanks


public class SettingsClass
{
private string _TestString;
public string TestString
{
get
{
return _TestString;
}
set
{
_TestString = value;
}
}
}

private void butTest_Click(object sender, EventArgs e)
{
SettingsClass Settings = new SettingsClass();

Settings.TestString = "abc" + Environment.NewLine + "xyz";

// Save the settings
using (StreamWriter Writer = new
StreamWriter(@"c:\work\test.xml"))
{
XmlSerializer Serializer = new
XmlSerializer(typeof(SettingsClass));

Serializer.Serialize(Writer, Settings);
}

// This shows the string has 8 chars
MessageBox.Show(Settings.TestString.Length.ToString());

// Look at the c:\work\test.xml file at this point and we see it does
// contain a cr+newline between the abc and xyz

// Load the settings
using (FileStream Reader = new FileStream(
@"c:\work\test.xml", FileMode.Open, FileAccess.Read,
FileShare.Read))
{
XmlSerializer Serializer = new
XmlSerializer(typeof(SettingsClass));

Settings = (SettingsClass)Serializer.Deserialize(Reader);
}

// This shows the string is now only 7 chars in length.
// Only the newline remains, the cr has been stripped out!
MessageBox.Show(Settings.TestString.Length.ToString());
}

Can't really explain this behavior - it sounds a bit like XML whitespace
normalization and end-of-line handling, but whitespace in serialized strings
should still be handled correctly.

However, here's something from a Google search... try creating an
XmlSerializer on top of XmlWriter rather than directly on top of
StreamWriter. If you don't do that, XmlSerializer creates an XmlWriter
itself and, apparently, it uses the old obsolete XmlXxxWriter classes
rather than XmlWriter.Create(), which results in a different implementation
being used.
 
Martin Honnen

Andrew said:
// Load the settings
using (FileStream Reader = new FileStream(
@"c:\work\test.xml", FileMode.Open, FileAccess.Read,
FileShare.Read))
{
XmlSerializer Serializer = new
XmlSerializer(typeof(SettingsClass));

Settings = (SettingsClass)Serializer.Deserialize(Reader);
}

// This shows the string is now only 7 chars in length.
// Only the newline remains, the cr has been stripped out!
MessageBox.Show(Settings.TestString.Length.ToString());

Try whether using an XmlTextReader with Normalization set to false helps:

XmlTextReader reader = new XmlTextReader(@"c:\work\test.xml");
reader.Normalization = false;
Settings = (SettingsClass)Serializer.Deserialize(reader);

An XML parser is supposed to normalize crlf to lf, see
http://www.w3.org/TR/xml/#sec-line-ends, so you have to explicitly turn
that off if you don't want it.
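
To see the line-end handling in isolation, here is a minimal in-memory sketch. The class and method names (LineEndDemo, ReadNormalized, ReadRaw) are made up for illustration and are not from the thread. It assumes XmlReader.Create gives a reader that normalizes line ends as the XML spec describes, and that XmlTextReader with Normalization set to false leaves the cr in place.

```csharp
using System;
using System.IO;
using System.Xml;

public static class LineEndDemo
{
    // Reads the root element's text with a reader from XmlReader.Create,
    // which applies the spec's end-of-line normalization.
    public static string ReadNormalized(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            return reader.ReadElementContentAsString();
        }
    }

    // Reads the same text with an XmlTextReader whose Normalization
    // property is explicitly false, so cr+lf should be passed through.
    public static string ReadRaw(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.Normalization = false;
        reader.MoveToContent();
        string text = reader.ReadElementContentAsString();
        reader.Close();
        return text;
    }

    public static void Main()
    {
        // Hypothetical sample document with a literal cr+lf in the text.
        string xml = "<TestString>abc\r\nxyz</TestString>";

        Console.WriteLine(ReadNormalized(xml).Length);
        Console.WriteLine(ReadRaw(xml).Length);
    }
}
```

If the two lengths differ by one, the missing character is the cr that the normalizing reader dropped.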
 
Pavel Minaev

Martin Honnen said:
Try whether using an XmlTextReader with Normalization set to false helps:
XmlTextReader reader = new XmlTextReader(@"c:\work\test.xml");
reader.Normalization = false;
Settings = (SettingsClass)Serializer.Deserialize(reader);

An XML parser is supposed to normalize crlf to lf, see
http://www.w3.org/TR/xml/#sec-line-ends, so you have to explicitly turn
that off if you don't want it.

An XML parser is supposed to normalize CR, true, but not if it's entered as
an entity reference - i.e., &#xD; - and I would fully expect
XmlWriter.WriteString to escape CR using &#xD;.

Of course, I might well be wrong.
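
For reference, whether an XmlWriter escapes cr depends on XmlWriterSettings.NewLineHandling. The default is NewLineHandling.Replace, which writes literal line breaks; NewLineHandling.Entitize writes the cr as a character reference so that a normalizing reader leaves it alone. Below is a rough round-trip sketch under that assumption. EntitizeDemo, Save and Load are illustrative names, and the settings class is reduced to an auto-property to keep the example short.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class SettingsClass
{
    public string TestString { get; set; }
}

public static class EntitizeDemo
{
    // Serializes through an XmlWriter configured to entitize new lines,
    // so the cr is written as a character reference, not a raw character.
    public static string Save(SettingsClass settings)
    {
        XmlWriterSettings writerSettings = new XmlWriterSettings();
        writerSettings.NewLineHandling = NewLineHandling.Entitize;

        StringWriter text = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(text, writerSettings))
        {
            XmlSerializer serializer = new XmlSerializer(typeof(SettingsClass));
            serializer.Serialize(writer, settings);
        }
        return text.ToString();
    }

    // Deserializes with the serializer's own (normalizing) reader.
    public static SettingsClass Load(string xml)
    {
        using (StringReader text = new StringReader(xml))
        {
            XmlSerializer serializer = new XmlSerializer(typeof(SettingsClass));
            return (SettingsClass)serializer.Deserialize(text);
        }
    }

    public static void Main()
    {
        SettingsClass settings = new SettingsClass();
        settings.TestString = "abc\r\nxyz";

        string xml = Save(settings);
        Console.WriteLine(xml);
        Console.WriteLine(Load(xml).TestString.Length);
    }
}
```

With this approach the reading code can stay as it is, because the fix lives entirely on the writing side.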
 
Andrew

Thanks. The following code fixes the problem. I didn't have to set the
Normalization option though. Just using the XmlTextReader seems to be
enough.

// Load the settings
using (FileStream Reader = new FileStream(
@"c:\work\test.xml", FileMode.Open, FileAccess.Read,
FileShare.Read))
{
XmlTextReader XmlReader = new XmlTextReader(Reader);

XmlSerializer Serializer = new
XmlSerializer(typeof(SettingsClass));

Settings = (SettingsClass)Serializer.Deserialize(XmlReader);
}
 
Andrew

Thanks Pavel.

I tried this but the xml file created still has a real cr+lf in the text so
it still loads the same. I was expecting these chars to be represented by
an escape sequence maybe.

Martin's solution to change the reader side did work though.

The question still remains whether the proper way to do it is to have real
cr+lf in the file or an escape sequence?
 
Martin Honnen

Andrew said:
Thanks. The following code fixes the problem. I didn't have to set the
Normalization option though. Just using the XmlTextReader seems to be
enough.

That Normalization property is false by default on XmlTextReader so that
explains why you did not have to set it explicitly to false.
 
