Should I use XML or not?

  • Thread starter: Alex

Alex

Hello,

I am writing a Word add-in in C#.
This add-in has to attach some metadata to bookmarks in the document. The metadata is stored in Word document variables.

Typical metadata can consist of a string identifier, three lists of [0 to 50] person-names each and about 5 Yes/No flags.

The problem is that I have to parse the metadata whenever the user moves the cursor onto bookmarked text so it has to be very fast, otherwise moving through the document will feel sluggish.

I was thinking of formatting the metadata using XmlTextWriter and parsing it using XmlTextReader.
However, I am not familiar with these classes and I am afraid that this approach may prove too slow on computers that are not state of the art.

The alternative is formatting and parsing the strings myself, perhaps using some combination of string.Split, string.Join, string.IndexOf and StringBuilder.

Since the metadata is only intended for the internal operation of the add-in, it can be as ugly as I want (in fact, I was thinking about obfuscating it to discourage users from messing with it, but this may add another slowdown).

What would you suggest?

Thank you,
Alex.
 
Hi Alex,
With this amount of data, using the classes in the System.Xml namespace should be fine.
Mohamed M. Mahfouz
Developer Support Engineer
ITWorx on behalf of Microsoft EMEA GTSC
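
For illustration, here is a minimal sketch of what a round trip with XmlTextWriter and XmlTextReader might look like. The element and attribute names ("meta", "id", "locked", "reviewer") and the class name XmlMetadataSketch are made up for this example, and only one name list and one flag are shown; the real metadata would carry three lists and about five flags.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the add-in described in this thread.
public static class XmlMetadataSketch
{
    // Writes an id, one flag and one list of names as a single XML element.
    public static string Format(string id, IList<string> reviewers, bool locked)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(sw);
        w.WriteStartElement("meta");
        w.WriteAttributeString("id", id);
        w.WriteAttributeString("locked", locked ? "1" : "0");
        foreach (string name in reviewers)
        {
            // The writer escapes characters such as & and < in the name.
            w.WriteElementString("reviewer", name);
        }
        w.WriteEndElement();
        w.Close();
        return sw.ToString();
    }

    // Reads the element back; parsed names are appended to 'reviewers'.
    public static void Parse(string xml, out string id, out bool locked,
                             List<string> reviewers)
    {
        id = null;
        locked = false;
        XmlTextReader r = new XmlTextReader(xml, XmlNodeType.Element, null);
        while (r.Read())
        {
            if (r.NodeType != XmlNodeType.Element) continue;
            if (r.Name == "meta")
            {
                id = r.GetAttribute("id");
                locked = r.GetAttribute("locked") == "1";
            }
            else if (r.Name == "reviewer")
            {
                reviewers.Add(r.ReadString());
            }
        }
        r.Close();
    }
}
```

Storing the flags as attributes and the names as child elements is just one possible layout; whether it is fast enough is something only a measurement on the target machines can tell.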
 
Mohamoss said:
Hi Alex,
With this amount of data, using the classes in the System.Xml namespace should be fine.

Thank you Mohamed,

However this brings me to another question:

Whenever I need to parse the XML string I create a reader instance like this:

XmlTextReader r = new XmlTextReader(myDataString, XmlNodeType.Element, null);

It seems a bit expensive to me.
Is it possible to create only one reader instance and supply it with different XML element strings?


Best wishes,
Alex.
 
Alex said:
However this brings me to another question:

Whenever I need to parse the XML string I create a reader instance like this:

XmlTextReader r = new XmlTextReader(myDataString, XmlNodeType.Element, null);

It seems a bit expensive to me.

Is this because you've tested it and found it's a bottleneck, or
because you're guessing that it's expensive? A key part of getting
systems to work efficiently is working out what to worry about and what
not to. I strongly suspect that the overhead of creating the
XmlTextReader is likely to be insignificant in your program.

I suggest you work out a reasonable speed you need this part of the
program to work at, and try it - then you'll know whether you need to
look into different approaches.
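
One rough way to "try it" is a small timing loop like the sketch below. It assumes the Stopwatch class (available from .NET 2.0 onwards), and the sample fragment, class name and iteration counts are invented for illustration. The figure it prints depends entirely on the machine, so treat it as a starting point rather than a benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Xml;

// Hypothetical timing harness; the names here are not from the thread.
public static class ParseTiming
{
    // Creates a fresh reader and walks the whole fragment 'iterations'
    // times, then returns the mean cost per parse in microseconds.
    public static double MeasureMicroseconds(string fragment, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            XmlTextReader r = new XmlTextReader(fragment, XmlNodeType.Element, null);
            while (r.Read()) { }
            r.Close();
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds * 1000.0 / iterations;
    }

    public static void Main()
    {
        string sample =
            "<meta id=\"bm1\" locked=\"1\">" +
            "<reviewer>Ann</reviewer><reviewer>Bob</reviewer></meta>";

        // Warm-up run so that JIT compilation is not counted.
        MeasureMicroseconds(sample, 1000);

        Console.WriteLine("{0:F2} us per parse",
                          MeasureMicroseconds(sample, 100000));
    }
}
```

Comparing the result with the response time you consider acceptable for a cursor move tells you whether the reader construction is worth worrying about at all.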

Alex said:
Is it possible to create only one reader instance and supply it with
different XML element strings?

Not that I know of.
 
I would suggest learning and using those XML classes first. If performance
problems appear, optimize your code, and if they persist, switch to the other
alternative. XML is very flexible (it can accommodate changes in your data
format), it has great support in the .NET Framework, and for small amounts of
data like yours it should be fast, but you must measure it.

--

Carlos J. Quintero

MZ-Tools 4.0: Productivity add-ins for Visual Studio .NET
You can code, design and document much faster.
http://www.mztools.com
 
Jon Skeet said:
Is this because you've tested it and found it's a bottleneck, or
because you're guessing that it's expensive? A key part of getting
systems to work efficiently is working out what to worry about and what
not to. I strongly suspect that the overhead of creating the
XmlTextReader is likely to be insignificant in your program.

I admit that my experience is based on C++ and Java and I am just starting my voyage into .NET land.
I ask these questions in order to leverage the collective wisdom of this community instead of constantly reinventing the wheel.

If you know of a resource that addresses good programming practices (as well as performance issues and tradeoffs) specific to C# / .NET, please point me to it.
 
I normally take the tack that Jon did in his response: write the code,
test to see if it's a problem, fix it only if it is.

However, in this case you do have a response issue: you have a fuzzy
deadline to meet from the time the user mouses over the bookmark to the
time that your information appears.

This, coupled with the fact that XML is really a data interchange
format and is, as you pointed out, rather heavyweight, would make me
lean toward designing my own string format using separators and
string's .Split() method.

That said, be aware that whatever format you choose should be
XML-friendly. That means that you should avoid using XML markers (<, >,
single and double quotes, and &) as delimiters. Since you're
inventing your own delimiting system, this shouldn't be a problem.
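
A minimal sketch of such a format, assuming '|' between fields and ';' between names. Both separators, the field order and the class name are invented for this example, and the sketch relies on person names never containing either character; if they can, some escaping would be needed.

```csharp
using System;

// Hypothetical helper illustrating a hand-rolled delimited format.
public static class DelimitedMetadataSketch
{
    // Neither separator is one of <, >, ', " or &.
    private const char FieldSep = '|';
    private const char ItemSep = ';';

    // Layout: id | flags (one '0' or '1' per flag) | list1 | list2 | list3
    public static string Format(string id, string flags,
                                string[] list1, string[] list2, string[] list3)
    {
        return string.Join(FieldSep.ToString(), new string[] {
            id,
            flags,
            string.Join(ItemSep.ToString(), list1),
            string.Join(ItemSep.ToString(), list2),
            string.Join(ItemSep.ToString(), list3)
        });
    }

    // Splits the stored string into its five fields.
    public static string[] SplitFields(string data)
    {
        return data.Split(FieldSep);
    }

    // Splits one list field into names. An empty field means an empty
    // list, not a list holding a single empty name.
    public static string[] SplitList(string field)
    {
        return field.Length == 0 ? new string[0] : field.Split(ItemSep);
    }
}
```

For example, an identifier "bm1", flags "10110", and lists {Ann, Bob}, {} and {Cy} would be stored as bm1|10110|Ann;Bob||Cy.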

As well, be sure to wrap your representation in a class, so that the
rest of your application code has no idea how the information is
stored. That way you can change the representation without having to
rewrite your app. You might even choose to experiment with XML to "see
how it goes", secure in the knowledge that you can rewrite your
serialization / deserialization code in a half hour and try something
different if needs be.
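
As a rough illustration of that wrapping, the sketch below keeps the storage format behind two methods. The class name BookmarkMetadata, its members and the "id|flag|names" layout are all assumptions made for the example, and it is trimmed to one flag and one list.

```csharp
using System;

// Hypothetical wrapper: callers work with typed members and never see
// the stored string.
public sealed class BookmarkMetadata
{
    public string Id = "";
    public bool Locked;
    public string[] Authors = new string[0];

    // Serialize and Deserialize are the only code that knows the format,
    // so switching to XML later would mean rewriting just these two.
    public string Serialize()
    {
        return Id + "|" + (Locked ? "1" : "0") + "|" + string.Join(";", Authors);
    }

    public static BookmarkMetadata Deserialize(string stored)
    {
        string[] fields = stored.Split('|');
        if (fields.Length != 3)
        {
            throw new FormatException("Unexpected metadata layout");
        }

        BookmarkMetadata m = new BookmarkMetadata();
        m.Id = fields[0];
        m.Locked = fields[1] == "1";
        m.Authors = fields[2].Length == 0 ? new string[0] : fields[2].Split(';');
        return m;
    }
}
```

The rest of the add-in would then call BookmarkMetadata.Deserialize on the document variable's value and read Id, Locked and Authors, without caring how they were encoded.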

However, I don't see how XML is going to buy you anything here, and it
can only cost you. Yes, in general it's better to use generic services,
and yes, in general it's better to code first and then find bottlenecks
later, but this is one case in which I see no value in using the
generic service (XML).
 