Read XML and add to it with XmlTextReader / XmlTextWriter

roger_27 · Dec 10, 2009

I'm having quite a problem here. I need to take an XML file, read it, and
write it out to a different file with XmlTextWriter as it is read. After it's
done reading, the program will add some more content to the end.

I want to use XmlTextReader/XmlTextWriter because they use far less memory
than XmlDocument.

Here's what I have; maybe someone out there can help me out:

//copy source to a temp spot.
File.Copy("old.txt", "tmp.txt", true);

XmlTextReader xds = new XmlTextReader("tmp.txt");

//actual XML file setup...
XmlTextWriter xmlwrite = new XmlTextWriter("new.txt", null);

//this will be indented text
xmlwrite.Formatting = System.Xml.Formatting.Indented;

//writes the XML declaration; passing false gives standalone="no"
xmlwrite.WriteStartDocument(false);

while (xds.Read())
{
    xds.MoveToElement();

    if (xds.NodeType == XmlNodeType.Element)
    {
        xmlwrite.WriteElementString(xds.Name, xds.Value);
    }
}

The source XML looks like this:

<?xml version="1.0" standalone="no"?>
<Statements>
  <Customer>
    <NAME>William Jones</NAME>
    <CUSNAME>William Jones</CUSNAME>
    <CUSCO />
    <CUSADR1>787 McFall</CUSADR1>
    <CUSADR2 />
    <CUSCSZ>Los Angeles, CA 95336-6941</CUSCSZ>
  </Customer>
</Statements>

What's happening is that it reads the element text as a separate node.
For example, the first Read() after the XML declaration (and after
<Statements> and <Customer>) gives

xds.Name = "NAME" and xds.Value = ""
xds.NodeType is Element

The next Read() gives
xds.Name = "" and xds.Value = "William Jones"
xds.NodeType is Text

It seems to be reading the element as its own thing, and the text as its
own thing.

Can't I read the element and its value together? Any help is appreciated.

Thanks.

Peter Duniho · Dec 11, 2009

roger_27 said:
[...]
can't I read the element and its value together?? any help is appreciated
here.

No, you can't. Not with XmlReader.

You say you specifically want to use XmlReader to reduce memory usage.
But you need to consider _why_ doing so reduces memory usage. In
particular, it reduces memory usage because XmlReader parses the file
one syntactical component at a time. When it reads an "element", it is
literally stopping as soon as it's read the opening tag for a given element.

You'll find that you also get a separate read operation for the close of
the element too, in addition to individual reads for each individual
component in between.
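
To make the one-node-per-Read() behaviour concrete, here is a minimal sketch (not from the original posts). The names NodeTrace and Trace are made up for illustration, and it uses XmlReader.Create with IgnoreWhitespace instead of XmlTextReader only to keep the trace short; with whitespace left on you would also see Whitespace nodes between the elements.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class NodeTrace
{
    // Returns one "NodeType:Name:Value" entry per successful Read() call.
    public static List<string> Trace(string xml)
    {
        var seen = new List<string>();
        // Assumption for this sketch: skip insignificant whitespace nodes.
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                // Element nodes have a Name but an empty Value;
                // Text nodes have a Value but an empty Name.
                seen.Add(reader.NodeType + ":" + reader.Name + ":" + reader.Value);
            }
        }
        return seen;
    }
}
```

For the input <Customer><NAME>William Jones</NAME><CUSCO /></Customer> this yields six entries: Element (Customer), Element (NAME), Text ("William Jones"), EndElement (NAME), Element (CUSCO, an empty element with no separate end node), and EndElement (Customer).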

If you want an object-model view of the XML, you have to use the
object-model API (either XmlDocument or XDocument, for example). If you
don't want to use the object-model API, then you can't get an
object-model view of the XML.
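
For comparison, a small sketch of the object-model view (again not from the original posts; DomView and FirstCustomerName are hypothetical names). It assumes the Statements/Customer/NAME layout from the question, and it loads the whole document into memory, which is exactly the cost being traded away by using XmlReader.

```csharp
using System;
using System.Xml.Linq;

static class DomView
{
    // Parses the whole document, then reads an element and its text together.
    public static string FirstCustomerName(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        // Assumes <Statements><Customer><NAME>...</NAME></Customer></Statements>.
        XElement name = doc.Root.Element("Customer").Element("NAME");
        // The explicit string conversion returns the element's text content.
        return (string)name;
    }
}
```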

Pete

roger_27 · Dec 17, 2009

Peter Duniho said:
No, you can't. Not with XmlReader.

Figured it out with some help from the Microsoft C# forums.

XmlTextReader/XmlTextWriter is 50 times better than XmlDocument because it
doesn't throw a System.OutOfMemoryException if the XML gets too big. As proof,
with this method I wrote a bug that ended up writing out an 11 GB XML file
without running out of memory. XmlDocument errored out on me after 3 GB of XML.

Here is the code below. Hope it helps someone out there.

//copy the source file to a .tmp file to read from
File.Copy(filename, filename.Replace(".txt", ".tmp"), true);

XmlTextReader xds = new XmlTextReader(filename.Replace(".txt", ".tmp"));

//actual XML file setup...
//xml writer
XmlTextWriter xmlwrite = new XmlTextWriter(filename, null);

//this will be indented text
xmlwrite.Formatting = System.Xml.Formatting.Indented;

//xml declaration; passing false writes standalone="no"
xmlwrite.WriteStartDocument(false);

int currentDepth = 0;
string elem = "";
string elemVal = "";
while (xds.Read())
{
    if (currentDepth < xds.Depth)
    {
        currentDepth += 1;
    }

    //now we loop through each item read and check out the node type
    switch (xds.NodeType)
    {
        //if it's an element do this
        case XmlNodeType.Element:
            //if it's one of the elements that have sub elements, you need
            //to use an IF statement for each one!!
            if (xds.Name.ToLower().Contains("customer") ||
                xds.Name.ToLower().Contains("statement"))
            {
                xmlwrite.WriteStartElement(xds.Name);
            }

            if (xds.Name.ToLower().Contains("itemtoprint"))
            {
                xmlwrite.WriteStartElement(xds.Name);
            }

            elem = xds.Name;
            break;

        //if it's XML text do this!
        case XmlNodeType.Text:
            elemVal = xds.Value;
            break;

        //if it's whitespace do this!
        case XmlNodeType.Whitespace:
            if (elem != "")
            {
                //make sure it's not one of the elements that have sub elements!
                if (elem.ToLower().Contains("customer") == false &&
                    elem.ToLower().Contains("statement") == false &&
                    elem.ToLower().Contains("itemtoprint") == false)
                    xmlwrite.WriteElementString(elem, "");
            }
            break;

        //if it's an end element do this!
        case XmlNodeType.EndElement:
            if (elem != "")
            {
                //write an end element if it's one of the elements
                //that have sub elements
                if (xds.Name.ToLower().Contains("itemtoprint") ||
                    xds.Name.ToLower().Contains("customer")
                    //I commented this part out, because this created an unfinished
                    //XML document, which I then add to later.
                    //||
                    //xds.Name.ToLower().Contains("statement")
                    )
                {
                    xmlwrite.WriteEndElement();
                }
                else
                {
                    xmlwrite.WriteElementString(elem, elemVal);
                }
            }

            //include any elements that have sub elements that AREN'T the root node
            if (xds.Name.ToLower().Contains("customer") ||
                xds.Name.ToLower().Contains("itemtoprint"))
            {
                xmlwrite.WriteEndElement();
            }

            currentDepth -= 1;
            elem = "";
            break;
    }
}

//close the reader so the temp file is no longer in use, then delete it.
xds.Close();
File.Delete(filename.Replace(".txt", ".tmp"));

Keep in mind, this code is meant for XML with the following structure:

<Statements>
  <Customer>
    <name>whatever</name>
    <address>whatever</address>
    <itemtoprint>
      <service>water service</service>
    </itemtoprint>
  </Customer>
  <Customer>
    <name>whatever2</name>
    <address>whatever2</address>
    <itemtoprint>
      <service>water service2</service>
    </itemtoprint>
  </Customer>
</Statements>

So you will need to adjust the code for your own application.
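
If hard-coding the element names is a problem, one possible alternative (a sketch, not the code from this thread) is XmlWriter.WriteNode, which copies the current node with its whole subtree and advances the reader to the next sibling. The names StreamingAppend, CopyAndAppend and appendExtra are made up for illustration, and the sketch assumes the root element has no attributes or namespaces, as in the sample above.

```csharp
using System;
using System.IO;
using System.Xml;

static class StreamingAppend
{
    // Streams every child of the root from source to target, then lets the
    // caller append more children before the root element is closed.
    public static void CopyAndAppend(TextReader source, TextWriter target,
                                     Action<XmlWriter> appendExtra)
    {
        var readerSettings = new XmlReaderSettings { IgnoreWhitespace = true };
        var writerSettings = new XmlWriterSettings { Indent = true };
        using (XmlReader reader = XmlReader.Create(source, readerSettings))
        using (XmlWriter writer = XmlWriter.Create(target, writerSettings))
        {
            writer.WriteStartDocument(false);        // standalone="no", as in the thread
            reader.MoveToContent();                  // skip the declaration, land on the root
            writer.WriteStartElement(reader.Name);   // assumption: no attributes/namespaces
            if (!reader.IsEmptyElement)
            {
                reader.Read();                       // first child, or the root's end tag
                while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
                {
                    // Copies one child subtree and moves to the next sibling.
                    writer.WriteNode(reader, true);
                }
            }
            appendExtra(writer);                     // new content goes at the end
            writer.WriteEndElement();                // close the root
        }
    }
}
```

For files, the same method could be called with a StreamReader over the temp copy and a StreamWriter over the output file; only one child subtree is being processed at any time, so memory use should stay flat in the same way as the loop above.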

Peter Duniho · Dec 17, 2009

roger_27 said:
Figured it out with some help from the Microsoft C# forums.

Such as this one?

XmlTextReader/XmlTextWriter is 50 times better than XmlDocument because it
doesn't throw a System.OutOfMemoryException if the XML gets too big. As proof,
with this method I wrote a bug that ended up writing out an 11 GB XML file
without running out of memory. XmlDocument errored out on me after 3 GB of XML.

We don't need proof. It's a well-known difference between using the
document object model and using a linear reader like XmlReader. I'm
surprised you were able to handle a 3GB XML file with XmlDocument
(assuming 32-bit Windows...on 64-bit it should be fine to read that much
and much more).

here is the code below. hope it helps someone out there. [...]

And note that just as I pointed out, with XmlReader you can't read the
element and its value (content) together. They have to be handled in
two separate operations.

Thanks for posting the code.

Pete

roger_27 · Dec 17, 2009

Peter Duniho said:
Such as this one?

These are Microsoft newsgroups; there are actually Microsoft C# forums too.

I think it's kind of redundant, and they really should just get rid of one
of them, but the forums are found here:

http://social.msdn.microsoft.com/Forums/en/category/visualcsharp/

I usually post my questions in both, just in case one of them doesn't get an
answer.

I hope my code sample can help someone out in the future. I hate when people
find a solution and don't post actual code to help out others. They just say
"I figured it out. I had to make an object and inherit the interface" and
you're stuck thinking "I wish he had shown me the code so I could make it work
myself."
