XmlTextReader Hack

Thread starter: Guest

I am trying to use an XmlTextReader to retrieve data. I need to use an
XmlTextReader because it is faster than using an XmlDocument.

I have found an inelegant way of retrieving each item's title and
description using string methods. See my code below. I would like to know how
I can retrieve the <title> and <description> nodes more directly using the
methods of the XmlTextReader.

My XML file looks something like:
<items>
<item><title>Title 1</title><description>Description 1</description></item>
<item><title>Title 2</title><description>Description 2</description></item>
</items>

My Hack:
XmlTextReader xmlTextReader = new XmlTextReader(myXmlUrl);

while (xmlTextReader.Read())
{
    if (xmlTextReader.NodeType == XmlNodeType.Element && xmlTextReader.Name == "item")
    {
        // Grab the whole <item>...</item> as a string, then cut the values out of it.
        // Note: ReadOuterXml() also moves the reader past </item>.
        string itemContent = xmlTextReader.ReadOuterXml();
        int headlineStart = itemContent.IndexOf("<title>") + "<title>".Length;
        int headlineLength = itemContent.IndexOf("</title>") - headlineStart;
        int descriptStart = itemContent.IndexOf("<description>") + "<description>".Length;
        int descriptLength = itemContent.IndexOf("</description>") - descriptStart;
        string headline = itemContent.Substring(headlineStart, headlineLength);
        string description = itemContent.Substring(descriptStart, descriptLength);

        Response.Write(headline + "<br>" + description + "<br><br>");
    }
}

Many thanks,

CR
 
String manipulation here is a bit messy, but works (I guess);

Personally, I use a bespoke XmlReader-based approach for this that uses a
custom iterator (so only has one "row" in memory at once); I tell it the
name of the parent nodes ("item" in this case) and each *direct* child
("title" and "description"), and it returns each row as an array matching
what I passed it. It copes with parents at different levels, children in any
order, and elements and attributes (children) - e.g.

foreach (string[] fields in reader.ReadElements("item", "title", "description")) {
    // title is fields[0], description is fields[1]
}

I can post / e-mail the source if you are interested, but it is a bit
long... but it shows a different way to do it (without the strings).

Or you can do it manually in about 20 ways...

Let me know if you want me to post it

Marc
 
Hi Marc, thanks for your help.

When I tried what you suggested, though, I got the error
"'System.Xml.XmlTextReader' does not contain a definition for 'ReadElements'"

Are you using an XmlTextReader?

I've sent you an email to your posted address.
 
no; this is (as stated) using a bespoke class - SimpleXmlReader; copied
below - use it something like:

using(SimpleXmlReader reader = new SimpleXmlReader({your chosen ctor})) {
// the code from before
}

Let me know if you have any problems,

Marc


using System;
using System.Xml;
using System.IO;
using System.Collections.Generic;
using System.Text;

namespace Your.Namespace.Xml {
    /// <summary>
    /// Provides a simple but efficient way of reading typical
    /// Xml structures with named (field) attributes or elements
    /// <b>immediately</b> underneath named (row) elements
    /// </summary>
    public sealed class SimpleXmlReader : IDisposable {
        private static Dictionary<string, int> GetDictionary(string[] nodeNames) {
            int index = 0;
            Dictionary<string, int> requiredValues = new Dictionary<string, int>(nodeNames.Length);
            foreach (string nodeName in nodeNames) {
                if (!string.IsNullOrEmpty(nodeName)) {
                    // deliberately breaks if duplicated key requested;
                    // value represents position in out array
                    requiredValues.Add(nodeName, index++);
                }
            }
            return requiredValues;
        }
        /// <summary>
        /// Returns the cited node values underneath the current element in the reader
        /// </summary>
        /// <param name="reader">The reader to investigate</param>
        /// <param name="nodeNames">The fields required (prefix attributes with @)</param>
        /// <returns>Array of strings, with each value in the corresponding
        /// position to the requested fields</returns>
        public static string[] ReadValues(XmlReader reader, params string[] nodeNames) {
            return ReadValues(reader, GetDictionary(nodeNames));
        }
        private static string[] ReadValues(XmlReader reader, Dictionary<string, int> requiredValues) {
            string[] result = new string[requiredValues.Count];
            string name;
            int index, targetDepth = reader.Depth + 1;
            bool isEmpty = reader.IsEmptyElement;
            if (reader.HasAttributes) {
                if (reader.MoveToFirstAttribute()) { // read attributes
                    do {
                        name = "@" + reader.Name;
                        if (requiredValues.TryGetValue(name, out index) && result[index] == null) {
                            result[index] = reader.Value;
                        }
                    } while (reader.MoveToNextAttribute());
                }
            }
            if (!isEmpty && reader.Read()) { // read sub-elements (note this progresses the cursor)
                while (true) {
                    int currentDepth = reader.Depth;
                    if (currentDepth < targetDepth) {
                        break; // too shallow: reached end of element
                    } else if (currentDepth > targetDepth) {
                        // too deep: only interested in first descendents
                        reader.Skip();
                    } else if (reader.NodeType == XmlNodeType.Element) {
                        // right depth; is it an element?
                        name = reader.Name;
                        if (requiredValues.TryGetValue(name, out index) && result[index] == null) {
                            result[index] = reader.ReadElementContentAsString(); // progresses cursor
                        } else { // not interested
                            reader.Skip(); // progresses cursor
                        }
                    } else if (!reader.Read()) { // progresses cursor
                        break; // end of the xml (somehow!)
                    }
                }
            }
            return result;
        }

        private Stream _stream;
        private readonly bool _leaveStreamOpen, _leaveReaderOpen;

        /// <summary>
        /// Create a SimpleXmlReader from a string
        /// </summary>
        /// <param name="xmlString">The xml to load</param>
        /// <returns>A new SimpleXmlReader object</returns>
        public static SimpleXmlReader Create(string xmlString) {
            return new SimpleXmlReader(Encoding.UTF8.GetBytes(xmlString));
        }

        public SimpleXmlReader(string path) : this(new FileStream(path, FileMode.Open), false) { }
        public SimpleXmlReader(byte[] data) : this(new MemoryStream(data), false) { }
        public SimpleXmlReader(Stream stream) : this(stream, false) { }
        public SimpleXmlReader(Stream stream, bool leaveOpen) {
            _leaveStreamOpen = leaveOpen;
            _leaveReaderOpen = false;
            _stream = stream;
        }
        public SimpleXmlReader(XmlReader reader) : this(reader, false) { }
        public SimpleXmlReader(XmlReader reader, bool leaveOpen) {
            _leaveStreamOpen = false;
            _leaveReaderOpen = leaveOpen;
            _reader = reader;
        }


        /// <summary>
        /// Reads the stream, returning an XmlDocument
        /// </summary>
        /// <returns>The XmlDocument of the data</returns>
        /// <remarks>This will read to the end of the stream, and
        /// cannot be repeated</remarks>
        public XmlDocument GetDocument() {
            XmlDocument doc = new XmlDocument();
            if (_stream != null)
                doc.Load(_stream);
            else if (_reader != null)
                doc.Load(_reader);
            else
                throw new InvalidOperationException("No stream/reader to load from");
            return doc;
        }

        private XmlReader _reader;
        public XmlReader Reader {
            get {
                if (_reader == null) {
                    XmlReaderSettings readerSettings = new XmlReaderSettings();
                    readerSettings.CloseInput = !_leaveStreamOpen;
                    readerSettings.IgnoreComments = true;
                    _reader = XmlReader.Create(_stream, readerSettings);
                }
                return _reader;
            }
        }
        /// <summary>
        /// Searches through the xml (forwards only) looking for the cited row-element;
        /// for each such found, the requested fields are returned to the caller
        /// </summary>
        /// <param name="rowElementName">The element to search for (at any level)</param>
        /// <param name="valueNodeNames">The fields required (prefix attributes with @)</param>
        /// <returns>One string array per row element found, produced lazily</returns>
        /// <remarks>Note that this searches the xml *while* enumerating - so only one row
        /// is ever in memory at once; this makes for highly efficient processing</remarks>
        public System.Collections.Generic.IEnumerable<string[]> ReadElements(string rowElementName, params string[] valueNodeNames) {
            XmlReader reader = Reader; // creates if not already there
            Dictionary<string, int> requiredValues = GetDictionary(valueNodeNames);
            while (reader.ReadToFollowing(rowElementName)) {
                yield return ReadValues(reader, requiredValues);
            }
        }

        public void Close() {
            if (_reader != null) {
                if (!_leaveReaderOpen) {
                    _reader.Close();
                }
                _reader = null;
            }
            if (_stream != null) {
                if (!_leaveStreamOpen) {
                    _stream.Close();
                    _stream.Dispose();
                }
                _stream = null;
            }
        }
        public void Dispose() {
            Close();
        }
    }
}
 
Thanks Marc, but this is too complex for what I need.
All I want to do is read the values from the title and description
nodes in my XML file using XmlTextReader methods.

How can I do that? Thanks

<items>
<item><title>Title 1</title><description>Description 1</description></item>
<item><title>Title 2</title><description>Description 2</description></item>
</items>
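
One direct way to do this, as a minimal sketch rather than a definitive answer: loop with Read(), and when the reader sits on a <title> or <description> start tag, call ReadString() to get its text. ReadString() leaves the reader on the closing tag instead of stepping past it, so the next Read() does not jump over a sibling that follows with no whitespace in between (as in the sample above). Assumptions: both elements hold plain text only, and the class name ItemReader and method ReadItems are made up for illustration. The XML is read from a string through a StringReader so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of System.Xml.
public static class ItemReader
{
    // Returns one { title, description } pair per <item>, in document order.
    public static List<string[]> ReadItems(TextReader input)
    {
        List<string[]> items = new List<string[]>();
        string title = null, description = null;

        using (XmlTextReader reader = new XmlTextReader(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    switch (reader.Name)
                    {
                        case "item":
                            // New row: forget the values of the previous one.
                            title = null;
                            description = null;
                            break;
                        case "title":
                            // Reads the text and stops on </title>.
                            title = reader.ReadString();
                            break;
                        case "description":
                            // Reads the text and stops on </description>.
                            description = reader.ReadString();
                            break;
                    }
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "item")
                {
                    // </item> reached: the row is complete.
                    // (A self-closing <item/> has no end tag and is skipped.)
                    items.Add(new string[] { title, description });
                }
            }
        }
        return items;
    }

    public static void Main()
    {
        string xml =
            "<items>\n" +
            "<item><title>Title 1</title><description>Description 1</description></item>\n" +
            "<item><title>Title 2</title><description>Description 2</description></item>\n" +
            "</items>";

        foreach (string[] fields in ReadItems(new StringReader(xml)))
        {
            // fields[0] is the title, fields[1] the description
            Console.WriteLine(fields[0] + ": " + fields[1]);
        }
    }
}
```

In the page from the first post, the same loop should work with new XmlTextReader(myXmlUrl) in place of the StringReader, and Response.Write in place of Console.WriteLine.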
 