The best way to replace chars as a file is loaded into a parser?

Guest

What's the most efficient way to replace characters in an XML document before
it is loaded into a parser? The characters I want to replace are in attributes,
and there can be N attributes. Also, let's assume that I'm not familiar with the
structure of the XML, so I'll have to read/replace/load via a stream.
Is this possible:
OpenStream -> Replace char(s) -> To XML parser? If so, then how?
Art
 
Art said:
What's the most efficient way to replace characters in an XML document before
it is loaded into a parser? The characters I want to replace are in attributes,
and there can be N attributes. Also, let's assume that I'm not familiar with the
structure of the XML, so I'll have to read/replace/load via a stream.
Is this possible:
OpenStream -> Replace char(s) -> To XML parser? If so, then how?

You could write your own TextReader which reads from another TextReader
and makes any replacements as it goes. If you only want to replace characters
which are in attributes, however, that makes it a good deal harder. I would
suggest loading it into a DOM tree, then going through each node and
replacing the characters in each node's attribute values.
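
The DOM approach suggested above does not require knowing the schema. Here is a minimal sketch in Python using the standard xml.dom.minidom module. The thread is about .NET, so this is an analogue rather than the API under discussion. The function name replace_in_attributes, the sample document, and the choice of replacing ';' with '-' are all assumptions made for illustration.

```python
from xml.dom import minidom

def replace_in_attributes(node, old, new):
    """Recursively replace `old` with `new` in every attribute value under `node`."""
    if node.nodeType == node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            attr.value = attr.value.replace(old, new)
    # Visit children regardless of node type; only elements carry attributes.
    for child in node.childNodes:
        replace_in_attributes(child, old, new)

# Hypothetical sample input: ';' appears in attributes and in element text.
doc = minidom.parseString('<order id="A;1"><item sku="B;2">x;y</item></order>')
replace_in_attributes(doc.documentElement, ";", "-")
result = doc.documentElement.toxml()
print(result)
```

Only attribute values are rewritten here; the element text is left alone, because the walk knows which kind of node it is visiting.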
 
Jon
What if you are not familiar with the XML structure? What if documents are
sent to you using different schemas? I mean, it would probably be possible to
code a generic solution which (starting at the root node) recursively asks
for children and attributes and then changes them when an unwanted character
occurs, but that would probably be overkill.
Reading XML as a stream, you don't have to worry about what's a node, element
or an attribute. You'd just take what's at the cursor's position and replace
as necessary.
But then how do you load the examined stream into a DOM?
Thanks for your help!
Art
 
Art said:
What if you are not familiar with the XML structure? What if documents are
sent to you using different schemas? I mean, it would probably be possible to
code a generic solution which (starting at the root node) recursively asks
for children and attributes and then changes them when an unwanted character
occurs, but that would probably be overkill.
Reading XML as a stream, you don't have to worry about what's a node, element
or an attribute. You'd just take what's at the cursor's position and replace
as necessary.

I'm not entirely sure what you mean by "reading XML as a stream". If
you *only* want to do a replacement within attributes, you *do* need to
know where you are.
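
One way to keep streaming while still knowing where you are is to let a pull parser report each element as it starts, and rewrite the attributes at that point. Here is a minimal Python sketch, assuming the standard xml.etree.ElementTree.iterparse as a stand-in for whatever streaming parser is available (the thread itself concerns .NET). The name parse_replacing_attributes and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def parse_replacing_attributes(stream, old, new):
    """Parse `stream` incrementally, replacing `old` with `new` in attribute values only."""
    root = None
    # A "start" event fires once an opening tag and its attributes have been read.
    for _event, elem in ET.iterparse(stream, events=("start",)):
        if root is None:
            root = elem  # the first element seen is the document root
        for name in list(elem.attrib):
            elem.attrib[name] = elem.attrib[name].replace(old, new)
    return root

# Hypothetical sample input: ';' appears in attributes and in element text.
sample = io.StringIO('<order id="A;1"><item sku="B;2">x;y</item></order>')
root = parse_replacing_attributes(sample, ";", "-")
print(ET.tostring(root, encoding="unicode"))
```

The parser, not the replacement code, decides what counts as an attribute, so no knowledge of the schema is needed.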

Art said:
But then how do you load the examined stream into a DOM?

What I was proposing was a class derived from TextReader which took
another TextReader, and when it was asked for some characters, read
some from the original and made any replacements. You could use that to
build the DOM in the normal way.
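
The wrapping-reader idea can be sketched as follows. This is a Python analogue, not .NET code: io.TextIOBase plays the role of TextReader and xml.dom.minidom the role of the DOM loader. The class name ReplacingReader, its constructor arguments, and the sample data are assumptions for illustration. The sketch handles only single-character replacements, so a match can never straddle the boundary between two reads.

```python
import io
from xml.dom import minidom

class ReplacingReader(io.TextIOBase):
    """Text stream that wraps another text stream and substitutes characters on read."""

    def __init__(self, source, replacements):
        self._source = source
        # Translation table mapping each single character to its replacement.
        self._table = str.maketrans(replacements)

    def readable(self):
        return True

    def read(self, size=-1):
        # Pull from the wrapped stream, then apply the replacements to that chunk.
        return self._source.read(size).translate(self._table)

# Hypothetical sample input: ';' appears in attributes and in element text.
source = io.StringIO('<order id="A;1"><item sku="B;2">x;y</item></order>')
doc = minidom.parse(ReplacingReader(source, {";": "-"}))
result = doc.documentElement.toxml()
print(result)
```

Because the wrapper sees only raw characters, it rewrites element text as well as attribute values, and it would alter the markup itself if the replaced character were significant there.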

However, if you're going to load it into a DOM anyway, why not just
perform the replacement on the in-memory version after it's loaded?
 