Maddening Dilemma - Compare each cell within column a to each cell in column b and find unique match

P

pogster

I have been trying to find a way in which i can have excel search
through two columns to find each unique match between the columns and
somehow change the formatting (like highlighting) to fish out the
matches. The problem is that i cannot think of a way to get each cell
in column a compared to each cell in column b, rinse and repeat for
the entire column a. And i am talking about thousands of rows per
column, and the columns are not necessarily the same size. Maddening!

If anyone has any ideas that do not involve installing shareware
plugins, please let me know!

Thanks!
 
P

pogster

Tom,
A great start upon a solution, i am attempting a workaround using the
info you provided, many thanks!

Although my problem is more intricate than i let on. Realistically, i
would like to find duplicate values within the same column, but some
of the values are negative and some ar positive. I need to find out,
and highlight, the values which would cancel each other out. Hence i
need to find duplicates, except that they are not EXACT duplicates,
but one is the positive version of the number and one is the negative
version of the number. Here is an small example of such a list:

A1
10 <-matches such as this and the negative 10 below it both need to
be highlighted
-10
11
12
13 <-does not cancel and doesnt need to be highlighted
-12
-11
15
-15
-15 <-also does not cancel since one positve already canceled with one
negative and doesnt need to be highlighted

as you can see, i am attempting to highlight the corresponding
positive and negative values (the ones which would cancel each other
out) and leave the singular values unhighlighted. The problem is
even futher complicated by the fact that some values might have
multiple duplicates. There may be 10 positive number 11's and
nineteen negative number 11's, meaning i need the 10 positive versions
of the number 11 to cancel and highlight along with the 10 negative
versions of the number 11, but to leave the other 9 number 11's
unhighlighted.

I apologize for how confusing this sounds, but i would choose
confusing over hours of manual labor infused with human error anyday,
seeing as how i have thousands of rows within this one column with
many random and changing values to identify and compare.

If you or anyone can think of a solution more closely tailored to this
particular situation, i would be indebted to you forever.

In the meantime, i will work with what you provided to me Tom.

Thank you kindly.
 
P

pogster

Also, to clarify, the reason i was asking about two columns was
because i used absolute value on all of the negative numbers and put
them in a separate list next to the positive ones to find EXACT
matches. But if there is a way in which this wasnt necessary, that
would be far more ideal.

Thanks again guys!
 
P

Pete_UK

Are you allowed to sort the data, as well as taking the absolute
value? If so, then you are looking for pairs of numbers in adjacent
cells. Assuming that you have your ABS formula in column B, starting
with B1, and that you have sorted the data by column B, then put this
formula in C1:

=IF(B1=B2,"yes1","no")

and this one in C2:

=IF(AND(B2=B1,C1="yes2"),IF(B2=B3,"yes1","no"),IF(B2=B1,"yes2",IF(B2=B3,"yes1","no")))

Copy this formula down column C by double-clicking the fill icon (the
small black square in the bottom right corner of the cursor. It will
give you pairs of "yes1"/"yes2" down the column indicating paired
duplicates, and the occasional "no" meaning a single unpaired (unique)
value.

If you wanted to use conditional formatting on the values in column A,
then you can use Formula Is and then =LEFT(C1,3)="yes" and set your
colour.

I've just realised that this doesn't necessarily pair +10 with -10, so
if you have, say, three +10s and one -10 this would indicate all four
10s would be paired - does this matter?

If you wanted the data in the original order, then you should first
enter a simple sequence in column D (say), i.e. 1, 2, 3, 4 etc before
sorting the data (including column D) on column B and using the
formulae. Then fix the values in column C and re-sort the data by
column D to get it back to the original sequence - column D can then
be deleted.

Anyway, hope this helps.

Pete
 
M

Max

One play which can deliver both the CF desired and a way to extract the
cells in col A which do not cancel out ..

Assuming source numbers within A1:A100, as posted

Put in B1:
=IF(A1="","",COUNTIF($A$1:A1,A1))

Put in C1, then array-enter the formula by pressing CTRL+SHIFT+ENTER
(instead of just pressing ENTER):
=IF(B1="","",IF(ISNUMBER(MATCH(-A1&"_"&B1,$A$1:$A$100&"_"&$B$1:$B$100,0)),"",ROW()))
Select B1:C1, copy down to C100. Col C will "flag" only cells in col A which
do not cancel out.

Then you could easily apply CF to highlight col A pointing to col C
Select col A (A1 active), then apply CF using Formula Is:
=$C1<>""
Format the fill color to taste, OK out

And if you want to extract those cells in col A which do not cancel out in
another col (this might be useful), just put in say E1:
=IF(ROW()>COUNT(C:C),"",INDEX(A:A,SMALL(C:C,ROW())))
Copy down to E100. This will return all cells in col A which do not cancel
out, neatly bunched at the top in col E

---
 
G

Guest

M/s Pogster,Pete , Max and Tom,
Hi everybody. I am a learner in usage of Excel functions as well as VBA. I
hope the Excel function solutions offered by you experts will provide me more
insights. Meanwhile, I looked at Pogster's problem as a VBA problem and
tried to write a programme ( the first full fledged one I am writing
struggling with the codes & Help feature in VBA). I, understand his
requirement is to mark off one value with equivalent negative value whenever
it appears in the data range. My other assumptions were: i) Even a
neagative value may precede a positive value. ii) If there is only a pair of
positive values or negative value both will remain unhighlighted. iii) The
first opposite value has to be the basis for markoff iv)The marked off
entries have to have colored so that they can be distinguished from the
outstanding ones v) The programme has to proceed to look into the entire
range until it encounters a blank cell in the same column.

I have called the programme markOff. It has to be run through VBA say
using F5. Before commencing the execution, the cursor has to be kept on the
first cell of the range in the excel sheet. I have not prepared any code for
a button or for a message box (say, for asking the user whether the cursor is
in the first cell and if not to keep it there to proceed further), as writing
this programme itself has been almost a day's job taking away my weekend and
I am also required to learning coding for buttons and msgbox. The code is
given below. I request the experts to look into it and suggest improvement,
if any needed to make it more efficient and economical. The data I have
taken as the basis for testing the programme is given first, followed by the
programme.

A
-12
12
15
14
-15
13
-16
15
16
13
16
17
(intentionally left blank to see whether execution stops here or not)
17

The code:
----------------------------------------------------------------
Sub markOff()
'
' markOff Macro
' Macro recorded 29/09/2007 by Balan
'

Dim Num As Range
Dim Val As Double
Dim addr As String
Dim rownum As Integer
Dim colnum As Integer

rownum = ActiveCell.Row + 1
colnum = ActiveCell.Column
addr = ActiveCell.Address
Val = ActiveCell.Value
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum, colnum)

Do While ActiveCell.Value <> ""
If ActiveCell.Value = -Val Then
If ActiveCell.Interior.ColorIndex = 6 Then
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)

Else
ActiveCell.Interior.ColorIndex = 6
Range(addr).Interior.ColorIndex = 6
Application.Goto Reference:=Worksheets("Sheet1").Range(addr)
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
Do While ActiveCell.Interior.ColorIndex = 6

rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
Loop


addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)

End If
Else
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
If ActiveCell.Value = "" Then
Application.Goto Reference:=Worksheets("Sheet1").Range(addr)
rownum = ActiveCell.Row + 1
If ActiveCell.Value <> "" Then
Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum, colnum)
Do While ActiveCell.Interior.ColorIndex = 6
rownum = ActiveCell.Row + 1
Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum, colnum)
Loop
addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum, colnum)
End If
End If
End If
Loop

End Sub
 
G

Guest

Mr Tom
Pl see my response to this question. I shall be grateful for your comments
/ suggestions.
 
G

Guest

Mr.Max,
Pl see my response to this question. I shall be grateful for your comments
/ suggestions.
 
P

pogster

Balan,

Thanks so much for taking the time to write a script for this
dilemma. In my attempts to execute the VBA macro script in excel, i
recieved a compile erorr: syntax error which caused the macro not to
run. The error occured at this point:

Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)

Beneath the first DoWhile Loop.

I am not much a programmer and so am not sure as to why this
happened.
 
P

pogster

Max,
thank you for your valiant effort on this problem of mine.

To comment on the solution you provided, i used your formulas and used
conditional formatting but received results that were not accurate.
Maybe if i emailed you the actual data set i am working with and
showed to you how your formatting worked out, it would be easier.

In the meantime, what happened was that in column B was that numbers
ranging from 1 to 16 showed up in various places. What did you mean
for this column to do? It had 1's next to values which i knew
repeated in the negative, so i dont think that count was accurate.

With column C, it basically counted each row position in order except
for about 10 or 15 rows in the middle of my long dataset in column
A.

As for the conditional formatting, since column C did not work out
exactly how i think you had anticipated, the formatting was also off.
Although the formatting did what it was supposed to do, just that
column C seemed to be of little use in this case.

I will email you the dataset so that you can see for yourself, the
results i achieved with your formulas.

What exactly did you mean for columns B and C to do? I am not the best
with understanding the code in excel.

THank you again for all of your help in this matter.
 
P

pogster

Pete,

Thanks so much for your attempt at this problem.

It is very important that all of the positives cancel with all of the
negative and leave the unique values untouched. A -10 should only
cancel with a single +10, etc...

I have tried a variation of the suggestion you offered, and though it
works in a case where this does not come into play, my needs require a
very accurate canceling of unique opposing values.

I wish there was a way i could just post the raw list of values i am
working with so it would be easier to understand.

Thanks again for your help Pete.
 
P

Pete_UK

I don't have time to code this for you, but a VBA solution could work
along the lines of:

add a sequence to column B

sort the data in both columns in (say) descending order of column A
(all the positive numbers will come first, down through zero then
negative numbers)

start looking at both ends of the list with two variables - top and
bottom

if cell(top) = cell(bottom) then colour both cells: increment top:
decrement bottom
else if cell(top) > magnitude of cell(bottom) then increment top
else decrement bottom

do this until top>=bottom

re-sort the data back to how it was using sequence in column B

delete column B.

Hope this helps.

Pete
 
I

ilia

Let's suppose your column that contains the values indicated is a
named range called MyRange. The following multi-cell array formula
will return a unique occurence of each uncancelled item:

=INDEX(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0),SMALL(IF(MATCH(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0),IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),0),
0)=ROW(INDIRECT("1:"&ROWS(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0)))),MATCH(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0),IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),0),
0),""),ROW(INDIRECT("1:"&ROWS(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),0))))))

If there's an extra negative, it will return the negative, and if
there's an extra positive, it will return the positive. One zero will
be returned for all matching pairs. It will return #NUM error if your
array range is longer than the list. So, in this set:

10
-10
-10
11
-11
12
13
14
-14
15
-15
-15
16
16
-16


The result is this, assuming you array-entered the above formula in an
8-row range:

-10
0
12
13
-15
16
#NUM!
#NUM!


Hope this helps.
-Ilia
 
P

pogster

Heres a link to the source list i am trying to work with. There are
many more where this one came from.

Madenning
Dilemma2.xls


Thanks Gord for the ftp suggestions, and Pete again for the great
idea. Ill see if i can whip up some code for that.

Ilia, i will try out your solution in a second, thanks!
 
M

Max

Here's a working sample to illustrate:
http://cjoint.com/?kbafj66Yrv
NettOffPosnNeg.xls

Note that the CF suggested earlier
is to highlight cells is col A which *do not* cancel out.

If you want to CF it the other way round,
use the converse formula: =$C1=""

The formulas in col B and C need to be implemented correctly. They need to
be copied all the way down to the last row of data in source col A.

My suggestion as applied to your sample is available here:
http://www.flypicture.com/download/MzcyODY=
Madenning_Dilemma2_1.xls
 
M

Max

Sorry, pl disregard this earlier sample as I forgot to adapt the range
in col C's array formula to suit the actual extent of your sample data
My suggestion as applied to your sample is available here:
http://www.flypicture.com/download/MzcyODY=
Madenning_Dilemma2_1.xls


Here's the corrected sample:
http://www.flypicture.com/download/MzczMDU=
Madenning_Dilemma2_2.xls

The array formula in the top cell C1 should be:
=IF(B1="","",IF(ISNUMBER(MATCH(-A1&"_"&B1,$A$1:$A$2197&"_"&$B$1:$B$2197,0)),"",ROW()))

since your data was down to that row, 2197

No change to the formulas in B1 and E1.
All formulas in B1:C1, and in E1 to be copied down to row2197

---
 
M

Max

Some explanations:

Col B serves to assign an arbitrary unique "suffix" to the data in col A.
This suffix assignment is necessary to distinguish between all multiple
occurences of the source numbers which you have in col A. Eg there could be
multiple occurences of 10, -10, 11, -11, etc all the way down in col A.

Col C then checks for the match of the original source numbers in col A
concatenated with the corresponding suffixes in col B against the full table
array composed. Where it matches (ie cancels out), the expression returns a
null string. Where it doesn't match, it returns an arbitrary row number as a
flag. This flag can then be used in the CF formula to format col A for all
the non-cancellations, ie all the unique source numbers in col A which do
not cancel out

The flag could also be used as well in another col E, to "float up" all the
non-cancellations from col A.

Easiest to see what's happening by referring
to this small working sample (as posted earlier):
http://cjoint.com/?kbafj66Yrv
NettOffPosnNeg.xls


---
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top