XML ID ...

shapper

Hello,

I am creating a node in an XML file as follows:

private XDocument _context;
_context = XDocument.Load(String.Concat(path, "Slides.xml"),
    LoadOptions.SetBaseUri);
// ...
XElement _slide = new XElement("Slide",
    new XElement("Id", ???),
    new XElement("Name", slide.Name),
    new XElement("Published", slide.Published.ToString()),
    new XElement("Url", slide.Url)
);
_context.Root.Add(_slide);
_context.Save(new Uri(_context.BaseUri).LocalPath);

I would like the Id to be a unique, auto-incrementing INT (an identity),
like the one I would use in a SQL database.

What is the best way to get the Id to use?
(I am using Linq to ...)

Thanks,
Miguel
 
Peter Duniho

What do you mean? Does the fact that the id will eventually be stored
in an XML document have any bearing on the question at all?

The obvious answer is: maintain a variable, initialized to 0 and
incremented by one for each "Slide" element you create, and which you
use as the value for the "Id" element in the "Slide" element.

But, presumably you would have thought of that. So what is your _real_
question? Why did you show all that XML stuff? What is it about a
monotonically incrementing variable that doesn't work for your situation?

Pete
 
shapper

My real question is:

Suppose I already have 100 slides in the XML file.
My idea is, before inserting a new one, to query the XML file and
get the highest ID value,
and then use that ID + 1 as the new slide's ID.

Since I am querying the XML to get the highest ID, what would be the
best way to do that in terms of performance? Or is there a better
way?

I am not creating all the slides at once, so I can't just increment a counter ...

Thank you,
Miguel
 
Registered User

The last ID used (the highest ID) could be stored as an attribute of the
root node. That value can then be incremented as child nodes are added.
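
A minimal sketch of that idea, assuming the counter lives in a root attribute called "LastId" (a hypothetical name, nothing in the thread specifies one) and that a missing attribute means no slide has been added yet:

```csharp
using System.Xml.Linq;

static class SlideCounter
{
    // Adds a <Slide> element and returns the Id it was given.
    // "LastId" is an assumed attribute name, not something from the thread.
    public static int AddSlide(XDocument doc, string name, string url)
    {
        XElement root = doc.Root;

        // Casting a missing attribute to int? yields null, so a new file starts at 0.
        int lastId = (int?)root.Attribute("LastId") ?? 0;

        // checked makes an overflow throw instead of silently wrapping around.
        int newId = checked(lastId + 1);

        root.Add(new XElement("Slide",
            new XElement("Id", newId),
            new XElement("Name", name),
            new XElement("Url", url)));

        // Persist the counter in the document so it survives removed slides.
        root.SetAttributeValue("LastId", newId);
        return newId;
    }
}
```

Because the counter is never decreased, an Id is not reused after its slide is deleted, which is what an identity column in a database does as well.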

regards
A.G.
 
Peter Duniho

Hard to say. I'm not convinced performance is an issue.

After all, to add new elements to your XML, you have to rewrite the XML
file. Surely that's a lot more expensive than enumerating the existing
elements in the file, and in any case as far as I can tell from your
code example you are reading the entire document into memory to start
with anyway (so the most expensive part of enumerating the elements is
already done).

If you like, you can follow "A.G."'s suggestion and maintain the current
index count in the document. But personally, it seems to me that it
would make more sense to just build the document you intend to write
out, and then enumerate all your "Slide" elements, setting the "Id"
explicitly based on a newly-initialized counter.
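
A rough sketch of that renumbering pass, assuming the element names from the original post and a counter that starts at 1 (the starting value is an assumption):

```csharp
using System.Xml.Linq;

static class SlideRenumbering
{
    // Rewrites the Id of every <Slide> in document order: 1, 2, 3, ...
    public static void Renumber(XDocument doc)
    {
        int next = 1; // assumed starting value
        foreach (XElement slide in doc.Root.Elements("Slide"))
        {
            // SetElementValue replaces an existing <Id> child or adds one if it is missing.
            slide.SetElementValue("Id", next++);
        }
    }
}
```

This would be called just before saving the document.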

Note that that only works if you don't need the "Id" values to be
preserved between file versions. If you need a given "Slide" element to
have the same "Id" from one file version to the next, then you'll have
to either do what "A.G." suggests, or enumerate all of the "Slide"
elements to find the highest present "Id" value and start your counting
just past that.
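
For the "find the highest and count past it" option, something along these lines should work with LINQ to XML. It is only a sketch: it assumes every "Id" holds a valid integer, and the class and method names are made up for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

static class SlideIds
{
    // Returns one more than the highest Id currently in the document,
    // or 1 when there are no slides yet.
    public static int NextId(XDocument doc)
    {
        int highest = doc.Root
            .Elements("Slide")
            .Select(s => (int?)s.Element("Id")) // null when a slide has no <Id>
            .Max() ?? 0;                        // Max() is null for an empty sequence

        // checked turns an overflow at int.MaxValue into an exception.
        return checked(highest + 1);
    }
}
```

In the original code this would replace the ??? placeholder, for example new XElement("Id", SlideIds.NextId(_context)).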

Also note that if you do need the "Id" values to be preserved between
file versions, eventually you could run out of integers. That
possibility exists in any case, but it's a lot more likely if the "Id"
value keeps increasing every time a new "Slide" element is added to the
file, even when others have been removed.

Even there, the user would have to add 2 billion (or so) "Slide"
elements (but not necessarily all present in the file at the same time)
before running into a problem. But it _could_ happen. A solution that
doesn't break until you actually have 2 billion elements _present_ in
the file is going to be much harder to make fail. :)

Pete
 
shapper

Peter Duniho said:
Hard to say. I'm not convinced performance is an issue.

After all, to add new elements to your XML, you have to rewrite the XML
file. Surely that's a lot more expensive than enumerating the existing
elements in the file, and in any case as far as I can tell from your
code example you are reading the entire document into memory to start
with anyway (so the most expensive part of enumerating the elements is
already done).

That is exactly the issue I have.
Is there a way to create, delete, or update a node without loading the
entire file?
Isn't loading the entire file too expensive in terms of performance?

Peter Duniho said:
If you like, you can follow "A.G."'s suggestion and maintain the current
index count in the document. But personally, it seems to me that it
would make more sense to just build the document you intend to write
out, and then enumerate all your "Slide" elements, setting the "Id"
explicitly based on a newly-initialized counter.

I need to preserve the same IDs, as they are related to something
else.
That said, is there a way to get the highest ID without loading the
file into memory?

Thanks,
Miguel
 
Peter Duniho

shapper said:
That is an issue I have.
Is there a way to create, delete, update a node without loading the
entire file?

There's always a way. :)

shapper said:
Isn't loading the entire file too expensive in terms of performance?

How large is the file? Most XML files are not large enough to cause a
problem. With a fast disk, plenty of physical RAM, and especially
64-bit Windows, you can hold very large amounts of data in memory all
at once without significant problems.

In fact, if you're able to do everything in-memory, it can actually be
faster and easier to code. After all, at _some_ point, to change an XML
file, you have to read all of the existing data and write all of the new
data. Whether it's all in memory at the same time is the only thing
that's optional.

shapper said:
I need to preserve the same IDs, as they are related to something else.
That said, is there a way to get the highest ID without loading the
file into memory?

Sure. You can use the XmlReader class to read through the file without
loading the entire thing into memory at once. It's just the "document
object model" classes like XmlDocument and XDocument where you have the
whole structure in-memory at once.

That said, XML is a recursive (nested) data structure. So an XML
document that has a lot of depth can still require a lot of memory
during processing. But most XML documents are much longer than they are
deep and so a linear parsing provided by XmlReader works fine and uses
much less memory than a DOM approach.
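
As an illustration, here is a sketch of finding the highest "Id" with XmlReader. It assumes "Id" elements appear only inside "Slide" elements, each holds a valid integer, and no two "Id" elements sit directly next to each other; the class and method names are invented for the example.

```csharp
using System.Xml;

static class StreamingSlideIds
{
    // Scans forward through the reader and returns the largest <Id> value seen,
    // or 0 if there is none. Only the current node is held in memory.
    public static int HighestId(XmlReader reader)
    {
        int highest = 0;

        // ReadToFollowing advances to the next element named "Id", wherever it is.
        while (reader.ReadToFollowing("Id"))
        {
            // Reads the element's text as an int and moves past its end tag.
            int id = reader.ReadElementContentAsInt();
            if (id > highest)
            {
                highest = id;
            }
        }
        return highest;
    }
}
```

To run it against the file from the original post, the reader would be created with XmlReader.Create on the path to Slides.xml and wrapped in a using block. Note that this only helps with finding the Id: adding the new element still means writing the file out again.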

But IMHO DOM is a much more convenient way to do it, and so unless you
expect to have to deal with files that are hundreds of megabytes or
larger, I'd say DOM is the way to go.

If it's really important, write it both ways, measure performance, and
figure out which is faster and more appropriate for your needs.
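
One simple way to do that measurement is with Stopwatch. The helper below is a sketch only (its name and shape are assumptions); it runs the action once as a warm-up and then averages over the requested number of iterations, which must be greater than zero.

```csharp
using System;
using System.Diagnostics;

static class Timing
{
    // Returns the average time one call to action takes over the given number of runs.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // warm-up run, so JIT compilation is not included in the timing

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        return TimeSpan.FromTicks(watch.Elapsed.Ticks / iterations);
    }
}
```

Each approach (loading with XDocument versus scanning with XmlReader) would be passed in as its own lambda, using the real files rather than small test data, since file size is the whole point of the comparison.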

Pete
 
shapper

Hi,

In this case I don't expect more than 100 MB, and even that is an
extreme that applies to only one file.

Most of the XML files will be around 2 MB to 10 MB. Only one could
reach 100 MB, but it will probably be around 20 MB to 40 MB.

So I think XDocument is the way to go ...

Thank You,
Miguel
 
