assigning a unique value

G

Guest

I have a spreadsheet of names, and I need to assign a unique number to each
of the names. Some entries are repeated, so each duplicate should have the
same number. The numbers assigned would be in a different column than the
names. For example, the result I need would look like this:
1 John Smith
1 John Smith
2 Carla Jones
3 Jenny Fortuna
4 James Wood
4 James Wood

and so on. Is there a formula that I can use to accomplish this, so I don't
have to go through the entire spreadsheet and type these numbers in? If
anyone can help, thanks.
 
A

Alan Beban

Dino said:
I have a spreadsheet of names, and I need to assign a unique number to each
of the names. Some entries are repeated, so each duplicate should have the
same number. The numbers assigned would be in a different column than the
names. For example, the result I need would look like this:
1 John Smith
1 John Smith
2 Carla Jones
3 Jenny Fortuna
4 James Wood
4 James Wood

and so on. Is there a formula that I can use to accomplish this, so I don't
have to go through the entire spreadsheet and type these numbers in? If
anyone can help, thanks.

If your data is in B1:Bwhatever, then enter =1 in A1; then in A2 enter
=IF(B2=B1,A1,A1+1) and fill down

Alan Beban
 
H

Harlan Grove

Alan Beban wrote...
....
If your data is in B1:Bwhatever, then enter =1 in A1; then in A2 enter
=IF(B2=B1,A1,A1+1) and fill down

You're assuming all duplicate names are grouped together. If duplicate
names wouldn't necessarily be grouped together, use 1 in A1 and

A2:
=A1+ISERROR(MATCH(B2,B$1:B1,0))

and fill down.
 
G

Guest

IF your name range is in column A beginning from Column A2 type this on
column B2 and copy it all the way down

=IF(COUNTIF($A$2:A2,A2)>1,VLOOKUP(A2,$A1:B$2,2,0),MAX($B$1:B1)+1)
 
A

Alan Beban

Harlan said:
Alan Beban wrote...
...



You're assuming all duplicate names are grouped together. If duplicate
names wouldn't necessarily be grouped together, use 1 in A1 and

A2:
=A1+ISERROR(MATCH(B2,B$1:B1,0))

and fill down.
????!

Alan Beban
 
A

Alan Beban

Alan said:
????!

Alan Beban

For the case in which the names might not be grouped together, I have a
solution with two helper columns that I will post if something more
efficient doesn't show up.

Alan Beban
 
H

Harlan Grove

N Harkawat wrote...
IF your name range is in column A beginning from Column A2 type this on
column B2 and copy it all the way down

=IF(COUNTIF($A$2:A2,A2)>1,VLOOKUP(A2,$A1:B$2,2,0),MAX($B$1:B1)+1)
....

The issue with COUNTIF is that it'll iterate through all cells in its
1st argument range. MATCH with 0 3rd argument will return on finding
the first match.

Also, OP's data had names in col B and numbers in col A. In that case,
need to use INDEX(.,MATCH()) rather than VLOOKUP.

For one cell per result, try these formulas.

A1:
1

A2:
=1+(B2<>B1)

A3:
=IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1,
INDEX(A$1:A2,MATCH(B3,B$1:B2,0)))

Fill A3 down as needed.

For recalc efficiency, better to enter the MATCH calls and cache the
running max col A values in other columns, so 3 cells per result. A1
and A2 cells remain the same as above, but A3 down need 2 ancillary
cells (I'll use cols X and Y).

A3:
=IF(ISERROR(X3),MAX(A$1:A2)+1,INDEX(A$1:A2,X3))

X3:
=MATCH(B3,B$1:B2,0)

Y3:
=MAX(A1:A2)

Y4:
=Y3+ISERROR(X3)

Fill A3 and X3 down as needed, fill Y4 down as needed.
 
A

Alan Beban

And here's an alternative set of formulas:

A1: =VLOOKUP(B1,B$1:D1,3,0)

C1: =SUM(1/COUNTIF(B$1:B1,B$1:B1)) array entered

D1: 1

D2:
=IF(B$2=B$1,IF(COUNTIF(B$1:B2,B2)=1,MAX(C$1:C1)+1,MATCH(B2,B$2:B2,0)),IF(COUNTIF(B$1:B2,B2)=1,MAX(C$1:C1)+1,MATCH(B2,B$1:B2,0)))

Fill down A1, C1, D2 to the row of the end of the name list.

Alan Beban
 
H

Harlan Grove

Alan Beban said:
And here's an alternative set of formulas:

A1: =VLOOKUP(B1,B$1:D1,3,0)

C1: =SUM(1/COUNTIF(B$1:B1,B$1:B1)) array entered

Or without array entry,

=SUMPRODUCT(1/COUNTIF(B$1:B1,B$1:B1))
D1: 1

D2:
=IF(B$2=B$1,IF(COUNTIF(B$1:B2,B2)=1,MAX(C$1:C1)+1,
MATCH(B2,B$2:B2,0)),IF(COUNTIF(B$1:B2,B2)=1,MAX(C$1:C1)+1,
MATCH(B2,B$1:B2,0)))

Starting with row 2, the function call counts on each row are

1 VLOOKUP O(K)
1 COUNTIF(r,r) O(K^2)
1 SUM[PRODUCT] O(K)
2 COUNTIF(r,x) O(K) both identical
2 MAX O(K) both identical
2 MATCH O(K)
3 IF O(1)
12 in total

where K is the row number. Copied down through N rows, the COUNTIF calls in
col C dominate, making the overall approach O(N^2 log(N)).

Compare the foregoing to the following single cell formula.

A3:
=IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1,
INDEX(A$1:A2,MATCH(B3,B$1:B2,0)))

Starting in row 3, the function call counts on each row are

2 MATCH O(K) both identical
1 MAX O(K)
1 INDEX O(1)
1 ISERROR O(1)
1 IF O(1)
6 in total

N rows each containing such formulas, so overall O(N log(N)).

What's the benefit of your formulas? It's clearly not efficiency, either in
terms of recalc speed, disk storage or RAM usage.

Then there's the 3 cell/result formulas. Fixing the A3 and Y4 formulas,

A3:
=IF(ISERROR(X3),Y3,INDEX(A$1:A2,X3))

X3:
=MATCH(B3,B$1:B2,0)

Y3:
=MAX(A1:A2)

Y4:
=Y3+ISERROR(X4)

The Y3 formula is O(1), constant time. Starting with row 4,

1 INDEX O(1)
2 ISERROR O(1)
1 IF O(1)
1 MATCH O(K)
5 in total

Over N rows, O(N log(N)) again, but with a constant factor reduction since
only one O(K) function call on each row.

Again, what's the benefit of your formulas?
 
G

Guest

Thanks for all your input. I have to try these solutions and see what the
results are.
 
A

Alan Beban

Harlan said:
. . .
What's the benefit of your formulas? It's clearly not efficiency, either in
terms of recalc speed, disk storage or RAM usage.

I didn't claim any advantage for those formulas; simply that they were
an alternative. I started working on them in an effort to provide
something that might be useful after that little snippet of silliness
that you proffered as a solution(!) for the case of an unsorted list of
names in your first posting in this thread. (By the way, some might be
wondering about the acknowledgment of your mistakes that you are wont to
claim you always step up to.) Since I had done the work, I posted the
result.

The differences in performance of the several approaches,
notwithstanding your esoteric analysis, are probably trivial for many
(most?) users in many (most?) circumstances.

But my posting the formulas did provide you another opportunity to
pontificate; so you ought to be grateful, not snotty. But then you
wouldn't be Harlan Grove, would you?

Alan Beban
 
A

Alan Beban

Just out of curiosity, are your names grouped or ungrouped? If they are
in fact grouped, then the simplest approach first posted should be
considered.

Alan Beban
 
H

Harlan Grove

Alan Beban wrote...
....
I didn't claim any advantage for those formulas; simply that they were
an alternative. . . .

As in '=1+1+1+1+1+1+1' is an alternative way of representing 7?

Some alternatives are bad ideas. Sadly, some people are incapable of
grasping that simple fact.
. . . I started working on them in an effort to provide
something that might be useful after that little snippet of silliness
that you proffered as a solution(!) for the case of an unsorted list of
names in your first posting in this thread. (By the way, some might be
wondering about the acknowledgment of your mistakes that you are wont to
claim you always step up to.) Since I had done the work, I posted the
result.

Reread my preceding message to which you responded. Especially, notice
the phrase, "Fixing the A3 and Y4 formulas." I'll be more explicit,

FIXING *MY* PREVIOUS A3 and Y4 FORMULAS, AND THE FORMULAS IN MY FIRST
RESPONSE DIDN'T WORK AT ALL.

Happy?
The differences in performance of the several approaches,
notwithstanding your esoteric analysis, are probably trivial for many
(most?) users in many (most?) circumstances.
....

In which case there are considerable advantages to the single
formula/result approach, namely,

A3:
=IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1,
INDEX(A$1:A2,MATCH(B3,B$1:B2,0)))

As for esoteric, it explains a lot that you're apparently indifferent
between O(N) and O(N^2) approaches to solving problems.
 
H

Harlan Grove

Alan Beban wrote...
Just out of curiosity, are your names grouped or ungrouped? If they are
in fact grouped, then the simplest approach first posted should be
considered.

Granted. If names are grouped, then Alan's original formula is optimal.

However, OP showed unsorted sample data. What's the *reasonable*
assumption to make when presented with unsorted data?
 
H

Harlan Grove

Harlan Grove wrote:
....
. . . If duplicate names wouldn't necessarily be grouped together,
use 1 in A1 and

A2:
=A1+ISERROR(MATCH(B2,B$1:B1,0))

Full disclosure: this is FUBAR.
 
A

Alan Beban

Harlan said:
Alan Beban wrote...



Granted. If names are grouped, then Alan's original formula is optimal.

However, OP showed unsorted sample data.

No; the OP showed grouped, unsorted sample data.
What's the *reasonable*
assumption to make when presented with unsorted data?

Oh please! Why are you driven to making a contest out of the dumbest
issues? The question is whether the OP showed grouped, unsorted data
because his real data is grouped and unsorted, or because he didn't
realize that his sample data didn't represent his real data. Either
assumption seems reasonable to me; and in the absence of any knowledge
about the OP, other than that he went to the trouble of grouping the
data, the first seems a bit more reasonable than the second. Might not
be the correct assumption, but certainly reasonable.

Alan Beban
 
A

Alan Beban

Harlan said:
. . .
Reread my preceding message to which you responded. Especially, notice
the phrase, "Fixing the A3 and Y4 formulas." I'll be more explicit,

FIXING *MY* PREVIOUS A3 and Y4 FORMULAS, AND THE FORMULAS IN MY FIRST
RESPONSE DIDN'T WORK AT ALL.

Happy?
. . .

I don't care a fig either way; you're the one who's made a big point of
acknowledging your mistakes.

Alan Beban
 
H

Harlan Grove

Alan Beban wrote...
Harlan Grove wrote: ....
....
. . . The question is whether the OP showed grouped, unsorted data
because his real data is grouped and unsorted, or because he didn't
realize that his sample data didn't represent his real data. . . .
....

In my experience grouped but unsorted data is exceedingly rare because
far & away the easiest way to group data is to sort it. Also, and you
may come to realize this as you gain experience in these newsgroups,
OPs seldom provide sample data that's truly representative of their
real data.
. . . other than that he went to the trouble of grouping the data, .
.. .
....

Unlikely the OP went to any trouble doing so. More likely he did it to
make the repeating codes in the first column more obvious.
 
A

Alan Beban

Harlan said:
. . .
OPs seldom provide sample data that's truly representative of their
real data.
. . .

Pure, unadulterated, characteristically arrogant and self-serving BS.

Alan Beban
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top