Removing Offsetting XML Nodes

G

gregarican

I have written a C# 2005 program that pulls the results on an SQL
query into an XML output file. The XML elements represent either a
retail sales or retail refund transaction. My dilemma is I need to
remove any offsetting nodes. For example, a sale and a refund for the
same stock item should cancel themselves out. Or a sale, refund, and
another sale for the same stock item should only appear as a single
sale.

A sample snippet of the XML would look similar to this

<transaction>
<documentNumber>SALE1</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

<transaction>
<documentNumber>SALE2</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

<transaction>
<documentNumber>REFUND1</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

What I'd need to do is somehow parse through the XML nodes
(XMLNodeList.SelectNodes I'm thinking?) and remove the offsetting
nodes containing SALE1 and REFUND1 so that only the SALE2 node would
remain. Any suggestions about the easiest way to accomplish this? I
can't readily utilize LINQ because of .NET 2.0 dependencies...
 
J

James A. Fortune

Just as a guess I'm going to say: easier said than done.

As others have said, the cleanest way to solve your problem is to edit
the SQL query so that the offending nodes never get created. You want
to exclude any transactions where a SALE documentNumber has a
corresponding REFUND documentNumber. Note that you do not need to
include any of the REFUND transactions in the SQL results (those with
references to sales transactions). Assuming that you are not using
some home-brewed way to connect the REFUND1 string somehow to the
SALE1 string, a field should exist that associates the REFUND1 to the
primary key of the SALE1 transaction. For example,

Transactions (table)
ID documentNumber stockNumber referenceID
1 SALE1 1234 Null
2 SALE2 1234 Null
3 REFUND1 1234 1
4 SALE1 1235 Null
5 SALE2 1235 Null

The '1' in the referenceID field ties the refund back to the first
transaction. Then the SQL to return the transactions you want might
look something like:

SELECT A.documentNumber, A.stockNumber
FROM Transactions AS A LEFT JOIN Transactions AS B ON A.ID =
B.referenceID
WHERE B.referenceID Is Null AND A.referenceID Is Null
ORDER BY A.documentNumber, A.stockNumber;

Result:

documentNumber stockNumber
SALE1 1235
SALE2 1234
SALE2 1235

James A. Fortune
(e-mail address removed)
 
J

Jeff Johnson

Assuming that you are not using some home-brewed way to connect
the REFUND1 string somehow to the SALE1 string, a field should
exist that associates the REFUND1 to the primary key of the SALE1
transaction.

And that's an assumption I was not willing to make, hence my answer. I'd lay
odds that the DB design is lacking in this area, and that program logic will
be needed to identify the matching pairs.
 
A

Arne Vajhøj

I have written a C# 2005 program that pulls the results on an SQL
query into an XML output file. The XML elements represent either a
retail sales or retail refund transaction. My dilemma is I need to
remove any offsetting nodes. For example, a sale and a refund for the
same stock item should cancel themselves out. Or a sale, refund, and
another sale for the same stock item should only appear as a single
sale.

A sample snippet of the XML would look similar to this

<transaction>
<documentNumber>SALE1</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

<transaction>
<documentNumber>SALE2</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

<transaction>
<documentNumber>REFUND1</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

What I'd need to do is somehow parse through the XML nodes
(XMLNodeList.SelectNodes I'm thinking?) and remove the offsetting
nodes containing SALE1 and REFUND1 so that only the SALE2 node would
remain. Any suggestions about the easiest way to accomplish this? I
can't readily utilize LINQ because of .NET 2.0 dependencies...

The best advice is:
1) Follow the suggestions of doing this in SQL. If this is not easy,
then change the table structure to make it easy.
2) Drop the XML. With this type of operations XML is making
solutions more difficult.

If you want to proceed then try something along the lines of:

using System;
using System.Xml;

namespace E
{
public class Program
{
public static void Main(string[] args)
{
string s = @"<alltransactions>

<transaction>
<documentNumber>SALE1</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

<transaction>
<documentNumber>SALE2</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

<transaction>
<documentNumber>REFUND1</documentNumber>
<stockNumber>1234</stockNumber>
</transaction>

</alltransactions>";
Console.WriteLine(s);
XmlDocument doc = new XmlDocument();
doc.LoadXml(s);
//XmlNodeList net =
doc.SelectNodes("//transaction[starts-with(documentNumber,'SALE') and
count(//transaction[documentNumber=concat('REFUND',substring-after(?/documentNumber,'SALE'))])=0]");
XmlNodeList net =
doc.SelectNodes("//transaction[starts-with(documentNumber,'SALE')]");
foreach(XmlElement elm in net)
{

if(doc.SelectSingleNode(string.Format("//transaction[documentNumber='{0}']",
elm.FirstChild.FirstChild.Value.Replace("SALE", "REFUND"))) == null)
{
Console.WriteLine(elm.OuterXml);
}
}
Console.ReadKey();
}
}
}

I tried getting it done with a single SelectNodes, but XPath (at least
in 1.0 which .NET uses) does not seem to have the necessary support.

Arne
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top