XmlDocument.Save() with null XmlResolver modifies DOCTYPE tag

eXavier

Hello,
I need to load an XML file that references a DTD, make some changes
through the DOM, and save it back. I need to avoid resolving the DTD address
because the machine has no Internet access. The following snippet
demonstrates my approach:

string txt = File.ReadAllText(@"C:\dfs.xml");
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
doc.LoadXml(txt);
doc.Save(@"C:\dfs2.xml");

The XML begins with:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE abc SYSTEM "http://www.abc.com/dtd/abc_v2.2.dtd">
...

But in the output XML, empty brackets are added after the DTD reference,
which causes the final consumer of the XML to fail:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE abc SYSTEM "http://www.abc.com/dtd/abc_v2.2.dtd"[]>
...

Is this a bug? I found no mention of this behavior in MSDN...

Thanks
eXavier
 
Lingzhi Sun [MSFT]

Hello eXavier,

Welcome to Microsoft Newsgroup Support service. My name is Lingzhi Sun
[MSFT]. I am very pleased to work with you on this case.

The issue that you reported has been confirmed to be a product issue, which
is tracked on this Microsoft Connect webpage:
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=122168
Here I'd like to analyze the cause of this problem and then provide a
workaround for you.

//////////////////////////
Cause of the problem:

The underlying reader used by XmlDocument (i.e. XmlTextReader) does not
distinguish between a document with an empty internal subset and one with
no internal subset specified. In both cases the Value property of the
XmlTextReader is an empty string (""). When the XmlDocument.Save() method
is called, it calls the XmlTextWriter.WriteDocType() method to write the
document type (DOCTYPE) declaration of the XML file. I looked at the code of
this method and found that if a non-null value is passed for the subset
parameter, the XmlTextWriter writes [subset]; if a null value is passed, the
XmlTextWriter writes no brackets at all. Therefore, when
XmlTextReader.Value is an empty string ("") and it is passed as the subset
parameter, we get an empty pair of brackets, [], at the end of the DOCTYPE
declaration in the XML file.
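
The difference between a null and an empty subset can be checked in isolation. The following is a minimal sketch, not code from the product: the class and method names (DocTypeSubsetDemo, WriteDocType, Run) are hypothetical, and it only assumes that XmlTextWriter.WriteDocType() behaves as described above.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DocTypeSubsetDemo
{
    // Hypothetical helper: writes a lone DOCTYPE declaration for the
    // sample document from this thread and returns the produced text.
    public static string WriteDocType(string subset)
    {
        StringWriter output = new StringWriter();
        using (XmlTextWriter writer = new XmlTextWriter(output))
        {
            writer.WriteDocType("abc", null,
                "http://www.abc.com/dtd/abc_v2.2.dtd", subset);
        }
        return output.ToString();
    }

    public static void Run()
    {
        // null subset: no brackets are expected after the system identifier.
        Console.WriteLine(WriteDocType(null));
        // Empty subset: the declaration is expected to end with "[]>".
        Console.WriteLine(WriteDocType(""));
    }
}
```

Calling DocTypeSubsetDemo.Run() prints both variants side by side, which makes it easy to see which one the consumer of the XML rejects.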

/////////////////////////
Workaround:

To work around this issue, we can call the XmlTextWriter.WriteDocType()
method directly to write the DOCTYPE declaration in the XML file instead of
using XmlDocument.Save(). For details, please see the following code snippet:

/////////////////////////////////////////////
// Requires: using System; using System.Text; using System.Xml;
// and using System.Windows.Forms; (for MessageBox).
try
{
    using (XmlTextReader reader = new XmlTextReader(@"dfs.xml"))
    {
        // Do not try to resolve the external DTD.
        reader.XmlResolver = null;
        using (XmlTextWriter writer = new XmlTextWriter(@"dfs2.xml", Encoding.UTF8))
        {
            // Depth at which copying starts; -1 while the reader is not yet positioned.
            int num = (reader.NodeType == XmlNodeType.None) ? -1 : reader.Depth;
            do
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        writer.WriteStartElement(reader.Prefix, reader.LocalName,
                            reader.NamespaceURI);
                        writer.WriteAttributes(reader, true);
                        if (reader.IsEmptyElement)
                        {
                            writer.WriteEndElement();
                        }
                        break;

                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;

                    case XmlNodeType.CDATA:
                        writer.WriteCData(reader.Value);
                        break;

                    case XmlNodeType.EntityReference:
                        writer.WriteEntityRef(reader.Name);
                        break;

                    case XmlNodeType.ProcessingInstruction:
                    case XmlNodeType.XmlDeclaration:
                        writer.WriteProcessingInstruction(reader.Name, reader.Value);
                        break;

                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;

                    case XmlNodeType.DocumentType:
                        if (reader.Value.Equals(string.Empty))
                        {
                            // Pass null instead of "" so that no "[]" is written.
                            writer.WriteDocType(reader.Name, reader.GetAttribute("PUBLIC"),
                                reader.GetAttribute("SYSTEM"), null);
                        }
                        else
                        {
                            writer.WriteDocType(reader.Name, reader.GetAttribute("PUBLIC"),
                                reader.GetAttribute("SYSTEM"), reader.Value);
                        }
                        break;

                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        writer.WriteWhitespace(reader.Value);
                        break;

                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                }
            }
            while (reader.Read() && ((num < reader.Depth) ||
                ((num == reader.Depth) && (reader.NodeType == XmlNodeType.EndElement))));
        }
    }
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
}
///////////////////////////////////////////////

If you have any questions, please feel free to let me know. Have a nice day!

Regards,
Lingzhi Sun (e-mail address removed)
Microsoft Online Community Support

Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

==================================================
Get notification to my posts through email? Please refer to
http://msdn.microsoft.com/en-us/subscriptions/aa948868.aspx#notifications.

MSDN Managed Newsgroup support offering is for non-urgent issues where an
initial response from the community or a Microsoft Support Engineer within
2 business days is acceptable. Please note that each follow up response may
take approximately 2 business days as the support professional working with
you may need further investigation to reach the most efficient resolution.
The offering is not appropriate for situations that require urgent,
real-time or phone-based interactions. Issues of this nature are best
handled working with a dedicated Microsoft Support Engineer by contacting
Microsoft Customer Support Services (CSS) at
http://msdn.microsoft.com/en-us/subscriptions/aa948874.aspx
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.
 
eXavier

Hello Lingzhi Sun,
Thanks for your response. I appreciate your explanation of the problem.
Based on it, I created a class derived from XmlTextWriter that overrides the
WriteDocType() method: it changes the subset parameter to null if it is an
empty string and then calls the base implementation. Then I call
XmlDocument.Save(XmlWriter), passing in an instance of my custom XmlTextWriter.
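
The class itself was not posted in the thread. A minimal sketch of what such a writer might look like follows; the names NoEmptySubsetXmlTextWriter and DocTypeSafeSave are hypothetical, and the indentation handling is an assumption added so that the output resembles what XmlDocument.Save(string) produces.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical class name; the post does not show the actual implementation.
public class NoEmptySubsetXmlTextWriter : XmlTextWriter
{
    public NoEmptySubsetXmlTextWriter(TextWriter w) : base(w) { }

    public NoEmptySubsetXmlTextWriter(string filename, Encoding encoding)
        : base(filename, encoding) { }

    public override void WriteDocType(string name, string pubid,
        string sysid, string subset)
    {
        // Treat an empty internal subset as "no subset" so "[]" is not written.
        base.WriteDocType(name, pubid, sysid,
            string.IsNullOrEmpty(subset) ? null : subset);
    }
}

public static class DocTypeSafeSave
{
    // Hypothetical helper: saves the document through the custom writer.
    public static void Save(XmlDocument doc, string path)
    {
        using (XmlTextWriter writer =
            new NoEmptySubsetXmlTextWriter(path, Encoding.UTF8))
        {
            // Save(XmlWriter) leaves formatting to the writer, so request
            // indentation explicitly unless whitespace is being preserved.
            if (!doc.PreserveWhitespace)
            {
                writer.Formatting = Formatting.Indented;
            }
            doc.Save(writer);
        }
    }
}
```

With this in place, the original snippet would end with DocTypeSafeSave.Save(doc, @"C:\dfs2.xml") instead of doc.Save(@"C:\dfs2.xml"). A non-empty internal subset is passed through unchanged.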

I also tried the workaround you proposed and it works well. However, I
checked the members of the XmlNodeType enumeration and there are some more
that are not covered by the snippet you sent. So it seems to me that using
the custom XmlTextWriter is simpler and also means less coding.
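
The gap between the enumeration and the snippet's switch statement can be listed programmatically. This is only an illustrative sketch: the class name NodeTypeCoverage is hypothetical, and the Handled array is transcribed by hand from the case labels in the workaround above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class NodeTypeCoverage
{
    // Node types that appear as case labels in the workaround's switch.
    static readonly XmlNodeType[] Handled =
    {
        XmlNodeType.Element, XmlNodeType.Text, XmlNodeType.CDATA,
        XmlNodeType.EntityReference, XmlNodeType.ProcessingInstruction,
        XmlNodeType.XmlDeclaration, XmlNodeType.Comment,
        XmlNodeType.DocumentType, XmlNodeType.Whitespace,
        XmlNodeType.SignificantWhitespace, XmlNodeType.EndElement
    };

    // Returns every XmlNodeType member that has no case label.
    public static List<XmlNodeType> Unhandled()
    {
        List<XmlNodeType> result = new List<XmlNodeType>();
        foreach (XmlNodeType type in Enum.GetValues(typeof(XmlNodeType)))
        {
            if (Array.IndexOf(Handled, type) < 0)
            {
                result.Add(type);
            }
        }
        return result;
    }
}
```

Not every member returned by Unhandled() is necessarily one that a forward-only reader reports from Read(), so the list shows where the switch has no case rather than proving that content would be lost.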

Anyway, once again - thanks a lot for your response!
eXavier

 
Lingzhi Sun [MSFT]

Hello eXavier,

You are welcome. I am glad to hear that the problem is resolved. Thank
you for your quick feedback.

Have a nice weekend!

Regards,
Lingzhi Sun (e-mail address removed)
Microsoft Online Community Support

 
