Faster XML Processing...


DMan

Need some help on how to make the following faster....

public XmlDocument ProcessXML( XmlDocument xmlData )
{
    XmlNode originalXML = xmlData.Clone();

    try
    {
        // Process XML data here...
    }
    catch
    {
        // if ANYTHING goes wrong with XML processing,
        // revert back to the original.
        xmlData.LoadXml(originalXML.OuterXml);
        return xmlData;
    }

    return xmlData;
}

There it is. Without cloning and reloading in the catch block, if something
goes wrong I get SOME XML processing, which I absolutely cannot have. For
example, if some data in the XML is invalid, I throw an exception, but I
still get back all the XML processed before that error occurred. I need
either to have the entire XML document processed error-free, or to revert back to
the original (as is being done in the code) if there is any error anywhere.
Can anyone think of a better approach? A faster approach?

Thank you.
 

Derek Harmon

DMan said:
public XmlDocument ProcessXML( XmlDocument xmlData )
{
    XmlNode originalXML = xmlData.Clone();
    ...
}
Can anyone think of a better approach? A faster approach?

Make the clone the working copy instead, and simply choose which document to return:

public XmlDocument ProcessXML( XmlDocument xmlData )
{
    XmlDocument workingXML = xmlData.Clone( ) as XmlDocument;

    try
    {
        // Process workingXML here...
    }
    catch
    {
        // if ANYTHING goes wrong with XML processing,
        // hand back the untouched original.
        return xmlData;        // pristine document
    }

    return workingXML;         // processed document
}

Save time by not taking these two expensive steps:

1) converting the node-tree of the original XML to string text (OuterXml
is a string),
2) converting that string text back into a node-tree.

Instead, shift your perspective on what the XmlDocument is that you
have cloned. It's not the back-up; it is the work-in-progress. The
serializing to/from string can be eliminated entirely by just choosing
which document to return.

The fact that Clone( ) is virtual (so the downcast is safe, because
XmlDocument's Clone( ) returns an XmlNode that is, in fact, an XmlDocument)
may have been responsible for obscuring this approach.


Derek Harmon
 

hOSAM

Generally, the XmlTextReader is faster but a bit more complicated. Now, if you
wish to avoid catching every thrown exception, you can filter the catch so that
only certain exception types are handled, or you can have a bool variable
that indicates the exception was caused by an internal function error.
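
To make that concrete, here is a minimal sketch (not hOSAM's code) of filtering
catch blocks by exception type. XmlValidationException is a hypothetical
exception invented for the example; XmlException is the framework type thrown
for malformed XML.

using System;
using System.Xml;

// Hypothetical exception type that the processing code might throw for its
// own validation failures (not part of the original post).
class XmlValidationException : Exception
{
    public XmlValidationException( string message ) : base( message ) { }
}

class Processor
{
    public static XmlDocument ProcessXML( XmlDocument xmlData )
    {
        XmlDocument workingXML = xmlData.Clone( ) as XmlDocument;

        try
        {
            // Process workingXML here, throwing XmlValidationException
            // (or letting XmlException surface) when the data is bad...
        }
        catch ( XmlException )
        {
            return xmlData;            // malformed XML: hand back the original
        }
        catch ( XmlValidationException )
        {
            return xmlData;            // our own validation error: same recovery
        }
        // Any other exception type is unexpected and propagates to the caller.

        return workingXML;
    }
}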
 

Jon Skeet [C# MVP]

DMan said:
Need some help on how to make the following faster....

public XmlDocument ProcessXML( XmlDocument xmlData )
{
    XmlNode originalXML = xmlData.Clone();

    try
    {
        // Process XML data here...
    }
    catch
    {
        // if ANYTHING goes wrong with XML processing,
        // revert back to the original.
        xmlData.LoadXml(originalXML.OuterXml);
        return xmlData;
    }

    return xmlData;
}

There it is. Without cloning and reloading in the catch block, if something
goes wrong I get SOME XML processing, which I absolutely cannot have. For
example, if some data in the XML is invalid, I throw an exception, but I
still get back all the XML processed before that error occurred. I need
either to have the entire XML document processed error-free, or to revert back to
the original (as is being done in the code) if there is any error anywhere.
Can anyone think of a better approach? A faster approach?

Why not just create a new XML document, try to load into that, and if
that fails return the old one? Why try to load it into the original at
all?
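
A minimal sketch of that idea (an illustration, not Jon's code): do all the work
on a freshly loaded copy, and hand back the untouched original if anything fails.

using System.Xml;

class Processor
{
    public static XmlDocument ProcessXML( XmlDocument xmlData )
    {
        XmlDocument workingCopy = new XmlDocument( );

        try
        {
            workingCopy.LoadXml( xmlData.OuterXml );   // copy into the new document
            // Process workingCopy here...
        }
        catch
        {
            return xmlData;        // any failure: the original was never touched
        }

        return workingCopy;        // success: return the fully processed copy
    }
}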
 

Derek Harmon

Jon Skeet said:
Why not just create a new XML document, try to load into that, and if
that fails return the old one? Why try to load it into the original at
all?

My observation is that it's not obvious that XmlNode's Clone( )
method is virtual, returning a polymorphic XmlNode, and that
XmlDocument IS-A XmlNode. If a programmer knows all three
facts and connects them to the problem, then an effective
solution is obvious.

On the other hand, many developers new to a class design and
driven by necessity will fall back to looking for a way to thunk
from an XmlNode into an XmlDocument. The path many choose
(as the more obvious one) is OuterXml.
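
As an illustration (a sketch, not code from either post, using a made-up
<order> document), here are the two routes from the cloned XmlNode back to
an XmlDocument; the cast avoids serializing and re-parsing the whole tree.

using System;
using System.Xml;

class CloneRoutes
{
    static void Main( )
    {
        XmlDocument source = new XmlDocument( );
        source.LoadXml( "<order><item qty='2'/></order>" );

        XmlNode cloned = source.Clone( );               // statically typed as XmlNode...

        // Route 1: the "obvious" thunk through text -- serialize, then re-parse.
        XmlDocument viaText = new XmlDocument( );
        viaText.LoadXml( cloned.OuterXml );

        // Route 2: the polymorphic route -- the clone already IS an XmlDocument.
        XmlDocument viaCast = (XmlDocument)cloned;      // no serialization, no re-parse

        Console.WriteLine( viaText.DocumentElement.Name );   // order
        Console.WriteLine( viaCast.DocumentElement.Name );   // order
    }
}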

In this specific respect, the XML DOM recommended by the
W3C is less intuitive than a model in which polymorphic
types are "in the developer's face." OOP languages make these
principles sublime in their subtlety, and less visible to the untrained
eye. Tools then must pick up the slack, but what tools can fit into
the developer's workflow with continuity as they solve this problem?

The .NET Framework SDK documentation does well by listing
inherited members in subclass docs, and that helps (imagine
how unwieldy the docs for XmlValidatingReader or PropertyGrid
would be if the SDK didn't document inherited members).

Today, it's an issue of developers having to gain experience with
the XML DOM object model, whereupon the connections
I cite above come about naturally with time.

Although the learning curve will solve this for everybody in time,
it's still a good case for motivating the evolution of the object model,
languages, and tools so that they enable more productive developers in
the future (who in turn drive greater ROI for the tasks given to them
by their managers, who in turn have larger budgets with which to
purchase new tools).


Derek Harmon
 
