XmlTextReader Hack

Guest

I am trying to use an XmlTextReader to retrieve data. I need to use an
XmlTextReader because it is faster than using an XmlDocument.

I have found an inelegant way of retrieving each item's title and
description using string methods. See my code below. I would like to know how
I can retrieve the <title> and <description> nodes more directly using the
methods of the XmlTextReader.

My xml file looks something like:
<items>
<item><title>Title 1</title><description>Description 1</description></item>
<item><title>Title 2</title><description>Description 2</description></item>
</items>

My Hack:
XmlTextReader xmlTextReader = new XmlTextReader(myXmlUrl);

while (xmlTextReader.Read())
{
    if (xmlTextReader.NodeType == XmlNodeType.Element && xmlTextReader.Name == "item")
    {
        // ReadOuterXml returns the whole <item> element as a string and
        // moves the reader to the node that follows </item>
        string itemContent = xmlTextReader.ReadOuterXml();
        int headlineStart = itemContent.IndexOf("<title>") + "<title>".Length;
        int headlineLength = itemContent.IndexOf("</title>") - headlineStart;
        int descriptStart = itemContent.IndexOf("<description>") + "<description>".Length;
        int descriptLength = itemContent.IndexOf("</description>") - descriptStart;
        string headline = itemContent.Substring(headlineStart, headlineLength);
        string description = itemContent.Substring(descriptStart, descriptLength);

        Response.Write(headline + "<br>" + description + "<br><br>");
    }
}

Many thanks,

CR
 
Marc Gravell

String manipulation here is a bit messy, but works (I guess);

Personally, I use a bespoke XmlReader-based approach for this that uses a
custom iterator (so only has one "row" in memory at once); I tell it the
name of the parent nodes ("item" in this case) and each *direct* child
("title" and "description"), and it returns each row as an array matching
what I passed it. It copes with parents at different levels, children in any
order, and elements and attributes (children) - e.g.

foreach (string[] fields in reader.ReadElements("item", "title", "description")) {
    // title is fields[0], description is fields[1]
}

I can post / e-mail the source if you are interested, but it is a bit
long... but it shows a different way to do it (without the strings).

Or you can do it manually in about 20 ways...

Let me know if you want me to post it

Marc
 
Guest

Hi Marc, thanks for your help.

When I tried what you suggested, though, I got the error:
"'System.Xml.XmlTextReader' does not contain a definition for 'ReadElements'"

Are you using an XmlTextReader?

I've sent you an email to your posted address.
 
Marc Gravell

no; this is (as stated) using a bespoke class - SimpleXmlReader; copied
below - use it something like:

using(SimpleXmlReader reader = new SimpleXmlReader({your chosen ctor})) {
// the code from before
}

Let me know if you have any problems,

Marc


using System;
using System.Xml;
using System.IO;
using System.Collections.Generic;
using System.Text;

namespace Your.Namespace.Xml {
    /// <summary>
    /// Provides a simple but efficient way of reading typical
    /// Xml structures with named (field) attributes or elements
    /// <b>immediately</b> underneath named (row) elements
    /// </summary>
    public sealed class SimpleXmlReader : IDisposable {
        private static Dictionary<string, int> GetDictionary(string[] nodeNames) {
            int index = 0;
            Dictionary<string, int> requiredValues = new Dictionary<string, int>(nodeNames.Length);
            foreach (string nodeName in nodeNames) {
                if (!string.IsNullOrEmpty(nodeName)) {
                    // deliberately breaks if duplicated key requested;
                    // value represents position in out array
                    requiredValues.Add(nodeName, index++);
                }
            }
            return requiredValues;
        }
        /// <summary>
        /// Returns the cited node values underneath the current element in the reader
        /// </summary>
        /// <param name="reader">The reader to investigate</param>
        /// <param name="nodeNames">The fields required (prefix attributes with @)</param>
        /// <returns>Array of strings, with each value in the corresponding
        /// position to the requested fields</returns>
        public static string[] ReadValues(XmlReader reader, params string[] nodeNames) {
            return ReadValues(reader, GetDictionary(nodeNames));
        }
        private static string[] ReadValues(XmlReader reader, Dictionary<string, int> requiredValues) {
            string[] result = new string[requiredValues.Count];
            string name;
            int index, targetDepth = reader.Depth + 1;
            bool isEmpty = reader.IsEmptyElement;
            if (reader.HasAttributes) {
                if (reader.MoveToFirstAttribute()) { // read attributes
                    do {
                        name = "@" + reader.Name;
                        if (requiredValues.TryGetValue(name, out index) && result[index] == null) {
                            result[index] = reader.Value;
                        }
                    } while (reader.MoveToNextAttribute());
                }
            }
            // read sub-elements (note this progresses the cursor)
            if (!isEmpty && reader.Read()) {
                while (true) {
                    int currentDepth = reader.Depth;
                    if (currentDepth < targetDepth) {
                        break; // too shallow: reached end of element
                    } else if (currentDepth > targetDepth) {
                        // too deep: only interested in first descendants
                        reader.Skip();
                    } else if (reader.NodeType == XmlNodeType.Element) {
                        // right depth; is it an element?
                        name = reader.Name;
                        if (requiredValues.TryGetValue(name, out index) && result[index] == null) {
                            result[index] = reader.ReadElementContentAsString(); // progresses cursor
                        } else { // not interested
                            reader.Skip(); // progresses cursor
                        }
                    } else if (!reader.Read()) { // progresses cursor
                        break; // end of the xml (somehow!)
                    }
                }
            }
            return result;
        }

        private Stream _stream;
        private readonly bool _leaveStreamOpen, _leaveReaderOpen;

        /// <summary>
        /// Create a SimpleXmlReader from a string
        /// </summary>
        /// <param name="xmlString">The xml to load</param>
        /// <returns>A new SimpleXmlReader object</returns>
        public static SimpleXmlReader Create(string xmlString) {
            return new SimpleXmlReader(Encoding.UTF8.GetBytes(xmlString));
        }

        public SimpleXmlReader(string path) : this(new FileStream(path, FileMode.Open), false) { }
        public SimpleXmlReader(byte[] data) : this(new MemoryStream(data), false) { }
        public SimpleXmlReader(Stream stream) : this(stream, false) { }
        public SimpleXmlReader(Stream stream, bool leaveOpen) {
            _leaveStreamOpen = leaveOpen;
            _leaveReaderOpen = false;
            _stream = stream;
        }
        public SimpleXmlReader(XmlReader reader) : this(reader, false) { }
        public SimpleXmlReader(XmlReader reader, bool leaveOpen) {
            _leaveStreamOpen = false;
            _leaveReaderOpen = leaveOpen;
            _reader = reader;
        }


        /// <summary>
        /// Reads the stream, returning an XmlDocument
        /// </summary>
        /// <returns>The XmlDocument of the data</returns>
        /// <remarks>This will read to the end of the stream, and
        /// cannot be repeated</remarks>
        public XmlDocument GetDocument() {
            XmlDocument doc = new XmlDocument();
            if (_stream != null)
                doc.Load(_stream);
            else if (_reader != null)
                doc.Load(_reader);
            else
                throw new InvalidOperationException("No stream/reader to load from");
            return doc;
        }

        private XmlReader _reader;
        public XmlReader Reader {
            get {
                if (_reader == null) {
                    XmlReaderSettings readerSettings = new XmlReaderSettings();
                    readerSettings.CloseInput = !_leaveStreamOpen;
                    readerSettings.IgnoreComments = true;
                    _reader = XmlReader.Create(_stream, readerSettings);
                }
                return _reader;
            }
        }
        /// <summary>
        /// Searches through the xml (forwards only) looking for the cited row-element;
        /// for each such found, the requested fields are returned to the caller
        /// </summary>
        /// <param name="rowElementName">The element to search for (at any level)</param>
        /// <param name="valueNodeNames">The fields required (prefix attributes with @)</param>
        /// <returns>One string array per row element found, produced lazily</returns>
        /// <remarks>Note that this searches the xml *while* enumerating - so only one row
        /// is ever in memory at once; this makes for highly efficient processing</remarks>
        public IEnumerable<string[]> ReadElements(string rowElementName, params string[] valueNodeNames) {
            XmlReader reader = Reader; // creates if not already there
            Dictionary<string, int> requiredValues = GetDictionary(valueNodeNames);
            while (reader.ReadToFollowing(rowElementName)) {
                yield return ReadValues(reader, requiredValues);
            }
        }

        public void Close() {
            if (_reader != null) {
                if (!_leaveReaderOpen) {
                    _reader.Close();
                }
                _reader = null;
            }
            if (_stream != null) {
                if (!_leaveStreamOpen) {
                    _stream.Close();
                    _stream.Dispose();
                }
                _stream = null;
            }
        }
        public void Dispose() {
            Close();
        }
    }
}
 
Guest

Thanks Marc, but this is too complex for what I need.
All I want to do is read the values from the title and description
nodes in my xml file using XmlTextReader methods.

How can I do that? Thanks.

<items>
<item><title>Title 1</title><description>Description 1</description></item>
<item><title>Title 2</title><description>Description 2</description></item>
</items>
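
For reference, here is a minimal sketch of one way to read those two values with the reader's own methods, not a definitive answer. It assumes that <title> and <description> contain text only (ReadElementString throws if an element has child elements) and that both sit inside <item>. The class name ItemReaderSketch and the helper ReadItems are hypothetical names made up for this illustration, and the sample reads from a string so that it is self-contained; the XmlTextReader(myXmlUrl) constructor from the original post could be used instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ItemReaderSketch
{
    // Hypothetical helper: returns one (title, description) pair per <item>,
    // in document order. Assumes both elements hold text only.
    public static List<KeyValuePair<string, string>> ReadItems(TextReader input)
    {
        List<KeyValuePair<string, string>> items = new List<KeyValuePair<string, string>>();
        string title = null;
        string description = null;

        using (XmlTextReader reader = new XmlTextReader(input))
        {
            // Do not report whitespace between elements as nodes.
            reader.WhitespaceHandling = WhitespaceHandling.None;

            bool more = reader.Read();
            while (more)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // ReadElementString returns the text and leaves the reader on
                    // the node after </title>, so Read() must not be called here.
                    title = reader.ReadElementString();
                }
                else if (reader.NodeType == XmlNodeType.Element && reader.Name == "description")
                {
                    description = reader.ReadElementString();
                }
                else
                {
                    if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "item")
                    {
                        // End of one <item>: store what was collected and reset.
                        items.Add(new KeyValuePair<string, string>(title, description));
                        title = null;
                        description = null;
                    }
                    more = reader.Read();
                }
            }
        }
        return items;
    }

    public static void Main()
    {
        string xml =
            "<items>" +
            "<item><title>Title 1</title><description>Description 1</description></item>" +
            "<item><title>Title 2</title><description>Description 2</description></item>" +
            "</items>";

        foreach (KeyValuePair<string, string> item in ReadItems(new StringReader(xml)))
        {
            Console.WriteLine(item.Key + ": " + item.Value);
        }
    }
}
```

The point to watch is that ReadElementString already advances the reader, so the loop only calls Read() when it has not just consumed an element. Calling Read() unconditionally can skip the next element when there is no whitespace between tags.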
 
