Counting number of times a specific word appears in a single cell

K

Ken Curtis

Hi,
I am using a function (=COUNTIF(data,"*"&nn&"*") that searches a range of
cells and tells me how many times "nn" appears. It's fine. However, if "nn"
appears twice in a single cell it is only counted once - not twice. How do I
have "nn" (or anything else) counted for the number of times it appears in a
range including it appearing twice (or more times) in a single cell?
 
J

Jacob Skaria

With your range in A1:A20 and the search string in cell B1; try the below
formula and feedback. Please note that this is an array formula. Within the
cell in edit mode (F2) paste this formula and press Ctrl+Shift+Enter to apply
this formula. If successful in 'Formula Bar' you can notice the curly braces
at both ends like "{=<formula>}"

=SUM(LEN(A1:A20)-LEN(SUBSTITUTE(A1:A20,B1,"")))/LEN(B1)

If this post helps click Yes
 
T

T. Valko

It depends on how "nn" is to be treated. Is "nn" to be considered a "word"
by itself or can it be part of any word?

For the best possible solution it would be a good idea if you posted some
samples of your data and the expected result.
 
J

Jacob Skaria

Missed to mention that the below formula is case-sensitive..If you want to
override that use the below instead..

=SUM(LEN(A1:A20)-LEN(SUBSTITUTE(UPPER(A1:A20),UPPER(B1),"")))/LEN(B1)

If this post helps click Yes
 
K

Ken Curtis

T.Valko

"nn" is a 'word'
--
Ken Curtis


T. Valko said:
It depends on how "nn" is to be treated. Is "nn" to be considered a "word"
by itself or can it be part of any word?

For the best possible solution it would be a good idea if you posted some
samples of your data and the expected result.
 
J

Jacob Skaria

Word match in a range....

To search words
=SUM(LEN(" " & SUBSTITUTE(A1:A20," "," ") & " ")-LEN(SUBSTITUTE(" " &
SUBSTITUTE(UPPER(A1:A20)," "," ") & " "," " & UPPER(B1) & " ","")))/LEN(" "
& B1& " ")

To search words(case sensitive)
=SUM(LEN(" " & SUBSTITUTE(A1:A20," "," ") & " ")-LEN(SUBSTITUTE(" " &
SUBSTITUTE(A1:A20," "," ") & " "," " & B1 & " ","")))/LEN(" " & B1& " ")


If this post helps click Yes
 
K

Ken Curtis

Jacob, Thanks for your help. However: Where do I paste your formula?
The data range is D4:T15, and the 'Countif' function (where I want my
result) is in D34. By the way, I need "nn" (or anything else) to be treated
as a word.
 
T

T. Valko

"nn" is a 'word'

OK, that means "nn" should have spaces on either side of it if it's within
the string.

nn is the nn
it's nn time!
no time for nn

Based on those examples this formula returns 4:

All on one line

=SUMPRODUCT(LEN(" "&A1:A3&" ")-LEN
(SUBSTITUTE(" "&UPPER(A1:A3)
&" "," NN ","")))/4

If you want to use a cell to hold the criteria:

C1 = nn

=SUMPRODUCT(LEN(" "&A1:A3&" ")-LEN
(SUBSTITUTE(" "&UPPER(A1:A3)
&" ",UPPER(" "&C1&" "),"")))/(LEN(C1)+2)
 
J

Jacob Skaria

Ken

Few points

1. The formula which you posted do not give a word count but only a string
count; isnt it. but now you are looking for a word count..right?

2. Paste the formula in D34. Again this is an array formula. Within the
cell in edit mode (F2) paste this formula and press Ctrl+Shift+Enter to apply
this formula. If successful in 'Formula Bar' you can notice the curly braces
at both ends like "{=<formula>}"

3. There are two versions of the formula; one is case senstive and the other
not. Choose as required.

With the seach word in cell A1 try the below..

To search words
=SUM(LEN(" " & SUBSTITUTE(D4:T15," "," ") & " ")-LEN(SUBSTITUTE(" " &
SUBSTITUTE(UPPER(D4:T15)," "," ") & " "," " & UPPER(A1) & " ","")))/LEN(" "
& A1 & " ")

To search words(case sensitive)
=SUM(LEN(" " & SUBSTITUTE(D4:T15," "," ") & " ")-LEN(SUBSTITUTE(" " &
SUBSTITUTE(D4:T15," "," ") & " "," " & A1 & " ","")))/LEN(" " & A1 & " ")

If this post helps click Yes
 
K

Ken Curtis

This is going to seem rather odd, 'T.Valko', but your formula (which does
work) only works if there is something between two 'nn's. For example in a
single cell with two 'nn's in it, only one 'nn' is counted unless another
word (for example 'ai') is placed between the two 'nn's. Put another way: nn
is counted as one nn, nn ai nn is counted as two nn's, but nn nn is only
counted as ONE nn NOT two. Suggestion?
 
K

Ken Curtis

Your formula works wonderfully ... except for one odd thing: It does not
count two consecutive 'nn's. For example: nn returns "1", nn ai nn returns
"2", but nn nn only returns 1. Am I missing something?
 
K

Ken Curtis

Yes, this works ... sort of. Here's the odd problem: 'nn' returns 1 , 'nn
ai nn' returns 2, however 'nn nn' returns only 1 (not 2, as it should).
Suggestions?
 
K

Ken Curtis

Please ignore my double entry. When I posted the forst one a 'box' popped up
to say that the service was 'temporarly unavailable', so I thought the
message had not been sent.
 
R

Rick Rothstein

Give this variation of Biff's formula a try...

=SUMPRODUCT(LEN(SUBSTITUTE(" "&A1:A10&" "," "&C1&" "," "&C1&" "))-
LEN(SUBSTITUTE(UPPER((SUBSTITUTE(" "&A1:A10&" "," "&C1&" "," "&
C1&" "))),UPPER(" "&C1&" "),"")))/(LEN(C1)+2)
 
K

Ken Curtis

Yes, I will. Thanks. But ... I do not see how your formula will be able to
count how many times the string 'nn' (or whatever) appears in a cell.
T.Valco had a great (and short) solution. It is this: =SUMPRODUCT(LEN("
"&D4:D15&" ")-LEN(SUBSTITUTE(" "&(D4:D15)&" "," nn ","")))/4 . But it has a
problem. If there is one nn in the cell it returns 1as the count, but if
there are two of them (i.e. nn nn), it still counts 1. Oddly, if I seperate
the string (nn ai nn) it then counts two nn's - which is the correct result.
Do you have any thoughts on this? Oh, it also counts 2 consecutive nn's if I
put two spaces between them - which, obviously, I do not want to do.
 
J

Jacob Skaria

Ken, have you tried the earlier one which handled this...?

=SUMPRODUCT(LEN(" " & SUBSTITUTE(D4:T15," "," ") & " ")-LEN(SUBSTITUTE(" "
& SUBSTITUTE(UPPER(D4:T15)," "," ") & " "," " & UPPER(A1) & " ","")))/LEN("
" & A1 & " ")

If this post helps click Yes
 
R

Rick Rothstein

First off, the version I gave you was for the case where your word was
placed in a cell, not hard coded into the formula (it is more flexible that
way). Second, you said the formula (you posted in your message) didn't work
right, so what does it matter how much shorter it is if it does do what you
want? Third, the reason why Biff's formula doesn't work right for
consecutive words is because the substitution "eats up" the blank space
between the words when processing the first occurrence of the consecutive
word which removes it from being available when the second of the
consecutive words is processed, hence the second word no longer has a
leading space to identify it as an individual word.... my modification adds
extra blanks around the word you are looking for so that multiple
consecutive occurrences will have their own spaces around them when they get
processed. Fourth, in looking at Jabob's offered formula, I see he uses the
same concept as I just described and his formula appears to work correctly
as well... however, his formula is more compact than the modified formula
that I posted, so I would use Jacob's formula instead.
 
T

T. Valko

This is why I asked you to post some samples!

As you can see, the formula has gotten to be somewhat complicated. This is
because the formula has to be able to distinguish "nn" as a unique word so
that you don't get "false positives" associated with other words that might
also contain the string "nn", for example: beginning.

If "nn" is already a unique word (no other words will also contain the
string "nn") then you can probably use a less complicated formula.

That's why I asked you to post some samples!

Without a very explicit explanation or samples we can only take out best
guess.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top