Subquery to pair off line items

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi there, I'm doing a reconcilation between two database journals and I need
to be able to compare records within the same table - ie if two entries have
the same ID and are within $5 of each other, I can create a field to "pair
off" the records, and later delete them. Anyone know how to compare values in
rows.
Thanks
 
Is there a primary key for the rows? Obviously ID is not the primary key
since you have duplicates?

Assumption:
You have a primary key on the rows of the table.

SELECT *
FROM TABLE
WHERE Table.PrimaryKey IN
(SELECT T1.PrimaryKey
FROM Table as T1 Inner Join Table as T2
ON T1.ID = T2.ID
WHERE Abs(T1.Amount-T2.Amount) <=5
AND T1.PrimaryKey <> T2.PrimaryKey)
Order by Table.ID
 
Thanks guys.
Both of your queries worked very well. They only show one side of the
matching leg (vertically) as the tables are shown side by side. I have a to
run an update query to add a match status on the original table and doing a
union with
SELECT b.*, a.* ...
seemed to give me both sides of the records. Prob not the prettiest way to
do it but thanks again.

Frank
 
Michel,
I think I spoke too soon. the problem is that this query will give me
multiple instances of the of the same primary key if that record matches more
than one other record. What I was hoping for was that once the record finds a
macth, both the records are flagged and no longer used in the
population/calculation. How could I limit my results to one match.
Thanks again
 
Here's an example of waht I mean.
It would be great to pair off the first line, mark the 2 IDs (174038, 58497
as a pair off) and move to the next record.

a.PK a.ID a.Amount b.PK b.ID b.Amount
174038 271887L 60416.67187 58497 271887L -60416.67
174038 271887L 60416.67187 124368 271887L -60416.67
174038 271887L 60416.67187 28631 271887L -60416.67
174038 271887L 60416.67187 102247 271887L -60416.67

Thanks again for any additional help.
 
John,
Thanks for the code. I tried your method and was wondering what would happen
if you had 3 records that matched the criteria. Would the first 2 that match
be paired off and the 3rd be left off the query -- this is what I am looking
for. Since the query doesn't show the matching record its hard to tell.
Thanks again for all your help.
Frank
 
If the query works the way I think it would work, all three records would
show up in the result.

You could try the following to show you the combinations of records. I'm
not at all sure it will work.

SELECT *
FROM Table as T1 Inner Join Table as T2
ON T1.ID = T2.ID
WHERE Abs(T1.Amount-T2.Amount) <=5
AND T1.PrimaryKey < T2.PrimaryKey
 
Hi,

You can add a GROUP BY (the primary key of the first table) and a COUNT(*)
to know if just one match, or more, are possible.

SELECT a.pk, MAX(b.pk), COUNT(*)
FROM ...
GROUP BY a.pk



IF more than one match is possible, for many original records, you need an
assignation algorithm and in general, I would not use SQL to write it. The
possible algorithm are the Simplex method, or, in this case, the graph
theory about finding the maximum flow.

Graph Theory based method: Basically, at the left, draw a single node, the
source, S. Then, to its right, in a single column, draws all the records (as
node for the graph), on a common vertical. Let us call these nodes the A
nodes. Link the Source, S, to each A node, with a link of capacity of 1.
Next, a little bit to the right, draw the records, again, in another column,
that will be the B nodes (in the same order, downward, that for the A
nodes). Link each A node to each B node but only if the difference in price
is less than 5. Do not link two nodes one the same line, though. Do not link
a A node to a B node if the B node is upper up than the A node. Again, each
link has a capacity of 1. Finally, to complete the graph, add a single sink
node, T (terminal) and link each B node to T, each link with a capacity of
1. Now, starts the pump, and, from S, pump as much water as you can, but
note that the pipes have a finite capacity. Once you have your maximum flow
from S to T, if you look at the pipes with water in them between A nodes and
B nodes, those pipes filled with water ARE your assignations. Indeed, each
node A would get just ONE pipe, between A and B, filled with water, because
although if many pipes get out of some A-node, only one pipe get into it,
from the source, and, sure, the amount of water is "preserved" (amount out
= amount in), which is automatically taken into account by the Max Flow
Algorithm. Also, since all coefficients are 1, no fractional answer would
naturally occur. So, we have the graph, just left to you to apply the Max
Flow Algorithm, or the Simplex method, if you prefer (more general, and
probably faster too, at least, than Ford-Fulkerson). But that is not job
done with SQL, in general.






Hoping it may help,
Vanderghast, Access MVP
 
Michel,
Thanks for your detailed response (I actually drew my nodes and pipes!).
Would you know where to get some source code to get me started. Still
searching the web for something I can use.
Regards,
F
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads

Subqueries 3
Subquery help 7
Subqueries 4
Subquery 2
"Paired" records 3
Subquery not integrating to query to display previous sum of quantities 3
Subquery Help 7
Dates Subquery 1

Back
Top