Split Multiple XML Documents from a Single File

Matt

I have a client that transmits a single file to us containing many XML documents. The problem is that each document may be in a different format and a different encoding, since they contain information from many different countries. We can handle all the different documents once they are split into single files, and we currently have a kludge that does the splitting.

I was wondering if someone has a good idea for performing this same process in C#. I know I can open the file in binary mode, look at each character, and find where one document ends and the next starts, but I am wondering whether this is the best method. Any samples or ideas are greatly appreciated.

Thanks,

Matt
 
Michael Nemtsev

Hello Matt,

What's wrong with the current process?
Since your XML file is not even well-formed, you can't rely on XML parsers;
all you can do is parse it manually.

Is it possible to change the client side to send files separately?

M> I have a client that transmits a file to us with many XML documents
M> enclosed. The problem is that each is a different format and may
M> have different encodings as they contain information from many
M> different countries. We can handle all the different documents when
M> they are split into single files and we currently have a cluge to do
M> this.
M>
M> I was wondering if someone has a good idea to perform this same
M> process in C#. I know I can open the file in binary mode and look at
M> each character and find where one document ends and the next starts
M> but I am wondering if this is the best method of doing this. Any
M> samples or ideas are greatly appreciated.
M>
M> Thanks,
M>
M> Matt
M>
---
WBR,
Michael Nemtsev :: blog: http://spaces.msn.com/laflour

"At times one remains faithful to a cause only because its opponents do not
cease to be insipid." (c) Friedrich Nietzsche
 
Matt

Michael,

The current process is a VB6 program, and it is slow and cumbersome. And no, the customer batches up a full day's work and transmits it to us as a single file. For some reason they feel it is cheaper to send everything together. A batch contains some 24k documents.

We want to use C# and try to make the process faster and cleaner. We know we cannot use any DOM, so I was asking the group whether anyone knows a way to split the documents out quickly. Because the encodings differ, you cannot treat the file as a string; we tried this and it failed miserably. I know I can read each byte and look for the beginning of each XML document, but I was hoping that there may be something faster and easier in .NET 2.0 that I had not run across yet.

Thanks,

Matt
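
For reference, the byte-level scan described above could look roughly like the sketch below. It is not a definitive implementation and it rests on assumptions that are not stated in the thread: that every document in the batch begins with an XML declaration ("<?xml" followed by whitespace), that every encoding in use is ASCII-compatible (UTF-8, ISO-8859-x, Windows-125x and similar, but not UTF-16), and that the text "<?xml " never appears inside a comment or CDATA section. The class and method names (XmlBatchSplitter, FindDocumentStarts, Split) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class XmlBatchSplitter
{
    // ASCII bytes for "<?xml". The byte that follows must be whitespace, so
    // processing instructions such as <?xml-stylesheet ...?> are not matched.
    private static readonly byte[] Marker = { 0x3C, 0x3F, 0x78, 0x6D, 0x6C };

    // Returns the byte offset at which each document starts.
    public static List<int> FindDocumentStarts(byte[] data)
    {
        List<int> starts = new List<int>();
        for (int i = 0; i + Marker.Length < data.Length; i++)
        {
            if (!IsDeclarationAt(data, i)) continue;

            int start = i;
            // Keep a UTF-8 byte order mark (EF BB BF) with the document it precedes.
            if (i >= 3 && data[i - 3] == 0xEF && data[i - 2] == 0xBB && data[i - 1] == 0xBF)
                start = i - 3;

            starts.Add(start);
            i += Marker.Length; // skip past the marker that was just matched
        }
        return starts;
    }

    // True when "<?xml" plus one whitespace byte begins at position pos.
    private static bool IsDeclarationAt(byte[] data, int pos)
    {
        for (int j = 0; j < Marker.Length; j++)
            if (data[pos + j] != Marker[j]) return false;

        byte next = data[pos + Marker.Length];
        return next == 0x20 || next == 0x09 || next == 0x0A || next == 0x0D;
    }

    // Copies each document's raw bytes, unchanged, into its own array.
    // Any bytes before the first declaration are dropped.
    public static List<byte[]> Split(byte[] data)
    {
        List<int> starts = FindDocumentStarts(data);
        List<byte[]> documents = new List<byte[]>();
        for (int k = 0; k < starts.Count; k++)
        {
            int end = (k + 1 < starts.Count) ? starts[k + 1] : data.Length;
            byte[] doc = new byte[end - starts[k]];
            Array.Copy(data, starts[k], doc, 0, doc.Length);
            documents.Add(doc);
        }
        return documents;
    }
}
```

The batch can be loaded with File.ReadAllBytes and each returned array written out with File.WriteAllBytes. Because the bytes are never decoded, each document keeps its original encoding, and the per-document parser can pick that up from the declaration as usual. Holding the whole batch in memory keeps the sketch short; for a very large file the same scan would have to be done over a buffered stream instead.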


