Counting Unique entry from Concatenated list

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

example of Data
--------------------
Col A Col B (By using CONCATENATE function)
A a,a,b,d
B d,d,c,a
C b,h,r

what i need in Col C is based on the criteria in Col A how many unique
entries are there in Col B, i.e.
if Col A = B then how may unique records are there in Col B, Answer is 3
(d,c,a)

what formula should i use for this,
NOTE : Col B each Concatenated entry is seperated by a ","
 
Where are you concatenating from?

I would think that it's easier to count unique entries in distinct cells
rather than in one
 
concatenating from another 5 different sheets
i need to count unique entries from the Col - B

is there any way to do that,

reagrds
Rajat
 
If your data is in A1, try

=SUMPRODUCT(--(FREQUENCY(FIND(MID(A1,ROW(INDIRECT("1:"&LEN(A1)))*2-1,1),A1),FIND(MID(A1,ROW(INDIRECT("1:"&LEN(A1)))*2-1,1),A1))>0))


This is based on your data being separated by a comma (without spaces).
This part of the formula pulls the 1,3, 5, 7, etc, characters

ROW(INDIRECT("1:"&LEN(A1)))*2-1

If your data contained no separators, this part of the formula changes to
ROW(INDIRECT("1:"&LEN(A1)))
 
You have a solution provided the items in column B are all 1 character long,
if not your problem becomes more complicated and maybe should be handled by a
custom function.
 
Thanks, I forgot to add that caveat.


ShaneDevenshire said:
You have a solution provided the items in column B are all 1 character long,
if not your problem becomes more complicated and maybe should be handled by a
custom function.
 
Hi ShaneDevenshire & JBM

Thanks for your help. And wish you a Happy New Year.

As you said this formula works on 1 character long text,
But can you please gave me the formula if the concatenated text contain
multiple character including space (Rajat Roy,JB M,Shane Devenshire) each
entry will be seperated by a "," (comma)

thanks in advance,

Regards
Rajat
 
example of Data
--------------------
Col A Col B (By using CONCATENATE function)
A a,a,b,d
B d,d,c,a
C b,h,r

what i need in Col C is based on the criteria in Col A how many unique
entries are there in Col B, i.e.
if Col A = B then how may unique records are there in Col B, Answer is 3
(d,c,a)

what formula should i use for this,
NOTE : Col B each Concatenated entry is seperated by a ","

This will work on any string sequences that are separated by ",".

Download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr/

Then use this array-formula. To enter an array formula, hold down
<ctrl><shift> while hitting <enter>. Excel will place braces {...} around the
formula:

=COUNTDIFF(REGEX.MID(A1,"[^,]+",ROW(
INDIRECT("$1:"&REGEX.COUNT(A1,"[^,]+")))))

If the strings might be longer than 255 characters, a VBA function can be used
to mimic the above.




--ron
 
Ron Rosenfeld wrote...
....
Download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr/
Agreed.

Then use this array-formula. To enter an array formula, hold down
<ctrl><shift> while hitting <enter>. Excel will place braces {...} around the
formula:

=COUNTDIFF(REGEX.MID(A1,"[^,]+",ROW(
INDIRECT("$1:"&REGEX.COUNT(A1,"[^,]+")))))
....

Avoid volatile functions. One possibility would be using MOREFUNC's
INTVECTOR function rather than ROW(INDIRECT(...)), but for short
strings, I'd just use

=COUNTDIFF(EVAL("{"""&SUBSTITUTE(A2,",",""",""")&"""}"))
 
Ron Rosenfeld wrote...
Thanks for the tip, but could you explain why?

Because formulas calling volatile functions are recalculated every time
anything triggers any recalculation. For example, the nonvolatile
formula =SUM(A1:A4) recalcs only when cells in A1:A4 change, so Excel
won't recalc this formula when you enter something into cell X99 if
none of the cells in A1:A4 depend on X99. However, Excel will
recalculate the formula =SUM(INDIRECT("A1:A4")) whenever you enter
anything into any cell, even if A1:A4 were blank. Lots of formulas
calling volatile functions kills recalc speed.

I'd point to Charles Williams's DecisionModels site, but I'm having
problems accessing it.
 
example of Data
--------------------
Col A Col B (By using CONCATENATE function)
A a,a,b,d
B d,d,c,a
C b,h,r

what i need in Col C is based on the criteria in Col A how many unique
entries are there in Col B, i.e.
if Col A = B then how may unique records are there in Col B, Answer is 3
(d,c,a)

what formula should i use for this,
NOTE : Col B each Concatenated entry is seperated by a ","

This will work on any string sequences that are separated by ",".

Download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr/

Then use this array-formula. To enter an array formula, hold down
<ctrl><shift> while hitting <enter>. Excel will place braces {...} around the
formula:

=COUNTDIFF(REGEX.MID(A1,"[^,]+",ROW(
INDIRECT("$1:"&REGEX.COUNT(A1,"[^,]+")))))

If the strings might be longer than 255 characters, a VBA function can be used
to mimic the above.




--ron


Based on Harlan's critique, the following formula would be more efficient:

=COUNTDIFF(REGEX.MID(A1,"[^,]+",
INTVECTOR(REGEX.COUNT(A1,"[^,]+"),1)))


--ron
 
Harlan,

I have found, by adding a messagebox to a UDF and making it volatile, that
it fires everytime *any* entry is made anywhere - even if it is not in the
active sheet. Can you tell me if the same also applies to Excel volatile
functions? I can't think of any way of checking that in an Excel function.

--
Regards,


Sandy
In Perth, the ancient capital of Scotland
and the crowning place of kings

(e-mail address removed)
(e-mail address removed) with @tiscali.co.uk
 
Sandy Mann wrote...
I have found, by adding a messagebox to a UDF and making it volatile, that
it fires everytime *any* entry is made anywhere - even if it is not in the
active sheet. Can you tell me if the same also applies to Excel volatile
functions? I can't think of any way of checking that in an Excel function.
....

It's the Volatile setting that causes it to recalc all the time. Note
that cell entry triggers minimal recalc, which triggers recalculation
of volatile functions.

Yes, built-in volatile functions behave the same as volatile UDFs.

Use the following to test that.

Function foo()
MsgBox Prompt:=Application.Caller.Address(0, 0, xlA1, 1),
Title:="Called from"
End Function

This is NOT volatile.

Enter =foo() in A1. You'll see a dialog box and the formula will return
0 (actually it returns the VBA variant value Empty, which is treated
the same as the 'value' of blank cells - Excel converts it to 0). Then
enter =RAND()+foo() in A2. You'll see another dialog - just one - and
it'll return a random number between 0 and 1. Now enter anything in
cell A3. You'll see another dialog for cell A2, but not for A1. Now
press [Ctrl]+[Alt]+[F9]. This time you'll see two dialogs, one for A1
and the other for A2.

Now clear A1:A2 and enter the following:

A1:
1

A2:
2

A3:
3

A4:
foobar

C1:
=SUM(A1:A4)+foo()

D1:
=SUM(INDIRECT("A1:A4"))+foo()

Now repeatedly enter anything in cell A6. Each time you should only see
a dialog for cell D1.
 
Thank you for that Harlan,

I found that on my XL97 [Ctrl]+[Alt]+[F9]. did nothing - must be me being
the poor cousin again :-(

I did however experiment an bit further and found that if in your last test
I entered =Sheet2!A1 in A3, I then got two dialogs, one for D1 and one for
C1, whenever I changed the value in Sheet2!A1 but only one for D1 if I
re-entered the same value again. I suppose that it is obvious really but it
had not occurred to me that linking sheets like that would cause the
function to fire even although it was not volatile or the active sheet.

I was naively thinking that C1 would only recalculate when the sheet became
active. That explains the performance problems I had with a workbook that I
wrote for my previous employer, (I have now retired), where I had 14 sheets
linked together in various ways.

--
Thank you again.

Sandy
In Perth, the ancient capital of Scotland
and the crowning place of kings

(e-mail address removed)
(e-mail address removed) with @tiscali.co.uk
 
I think Harlan and Ron have a good solution. Looks like I've got more
homework to do (learning how to use Laurent's add-in functions).

I would certainly try the other suggestions first as the add-in offers a
number of other functions that look like they'd be handy.

If, by chance, you cannot use it (e.g. your workplace won't allow the add-in
to be installed), you could try a UDF. Post back if that's what you need and
I'll post one (if someone else has not done so).
 
Dear Ron Rosenfeld

thanks for the help and i'm extremely sorry for the late reply
The following formula provided by you worked well
=COUNTDIFF(REGEX.MID(A1,"[^,]+",INTVECTOR(REGEX.COUNT(A1,"[^,]+"),1)))

but i'm having problem when there is only 1 text the formula show the text
not the number i.e.

When Cell A Contain - Formula Result
Roy,Roy,b,c - 3
Roy,,, - Roy

i need to count the unique text entry in the cell, can you please solve it,

Regards

Rajat

That is an interesting issue. I will discuss it with Longre. The issue seems
to be that the single item is not being returned as an array-constant, so
COUNTDIFF apparently returns the item, and not the count.

However, a work around, which forces the single item to be returned as an
array, would be to use the ARRAY.JOIN function:

=COUNTDIFF(ARRAY.JOIN(REGEX.MID(A1,"[^,]+",
INTVECTOR(REGEX.COUNT(A1,"[^,]+"),1))))

This seems to work as well as the previous.


--ron
 
Dear Ron Rosenfeld

I've a problem in this formula also, following formula return value 1 when
cell is blank cell A1 value nil as a result of CONCATENATE function i used in
cell A1

=COUNTDIFF(ARRAY.JOIN(REGEX.MID(A1,"[^,]+",INTVECTOR(REGEX.COUNT(A1,"[^,]+"),1))))

When Cell A Contain - Formula Result
Roy,Roy,b,c - 3
Roy,,, - 1
,,,, - 1

in Cell A i used formula =CONCATENATE(Z1,Y1,X1,K1)
Is there any other work around?

Regards

Rajat

What is happening:

When the REGEX returns nothing, as it will if A1 is empty, then the ARRAY.JOIN
returns a #VALUE! error. COUNTDIFF then counts that as one unique entry.

To correct that problem, we will exclude the #VALUE! error from being counted:

=COUNTDIFF(ARRAY.JOIN(REGEX.MID(A1,"[^,]+",
INTVECTOR(REGEX.COUNT(A1,"[^,]+"),1))),,#VALUE!)

The formula will now return a 0.


--ron
 
another point i forgot to mention after entering the last formula in the cell
a Msg Box appear which is as follows -

Title : Microsofy Visual Basic
Body Message : User-defined type not defined
Button : Ok , Help

is the formula problem is due to this or any other matter? Hope that you can
solve it.

regards

Rajat

I cannot reproduce this error message. Perhaps more detail?


--ron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top