formula to count charcter in a range

  • Thread starter Thread starter Dylan @ UAFC
  • Start date Start date
Can you help wiht the range.
Got for a single cell, =len(a1)
no problem

but =LEN(a1:a100) returns #value
 
Glenn said:
sure to commit the array-formulas with CTRL+SHIFT+ENTER).
....

Or replace all SUM calls below with SUMPRODUCT and enter them as
regular formulas.
4.  Contains exact case (result = 6)

=COUNT(--(FIND(B1,A1:A15)>0))

Inefficient. FIND will either be > 0 or an error value, so

=COUNT(FIND(B1,A1:A15))

would be sufficient since it'll count only the positive numbers, not
the errors.
5. Starting with either case (result = 8)

=SUM(--(LEFT(A1:A15,1)=B1))

Inefficient. Use

=COUNTIF(A1:A15,B1&"*")
6. Total occurrences of exact case (result = 9)

=SUM(LEN(A1:A15))-SUM(LEN(SUBSTITUTE(A1:A15,B1,"")))
....

Inefficient. Use

=SUMPRODUCT(LEN(A1:A15)-LEN(SUBSTITUTE(A1:A15,B1,"")))
8. Total occurrences of either case (result = 18)

=SUM(LEN(A1:A15))-SUM(LEN(SUBSTITUTE(SUBSTITUTE(A1:A15,LOWER(B1),""),
UPPER(B1),"")))

Inefficient. Use

=SUMPRODUCT(LEN(A1:A15)-LEN(SUBSTITUTE(LOWER(A1:A15),LOWER(B1),"")))
 
Suppose the range is A1 thru A10.

=SUM(LEN(A1:A10))

This is an array formula that must be entered with CNTRL-SHFT-ENTER rather
than just the ENTER key.
 
8. Total occurrences of either case (result = 18)
Inefficient. Use

=SUMPRODUCT(LEN(A1:A15)-LEN(SUBSTITUTE(LOWER(A1:A15),LOWER(B1),"")))

How would this formula compare efficiency-wise?

=SUMPRODUCT(COUNTIF(A1:A15,"*"&REPT("w*",ROW(1:3))))

where the 1:3 could be expanded to 1:N where N would be the maximum number
of "w"s or "W"s that could appear in a single cell.
 
I'm guessing the answer will depend on the size of N. For example, when N is
3 as in this case, the SUMPRODUCT will only iterate 3 times, but if N were
100, then it would have to iterate 100 times. Your function will always
iterate the as many times as there are rows in the range. So, the more rows,
greater in number than N, the better I would think for my formula as
compared to yours. That probably means your answer to my original question
to you will be "it depends".<g>
 
....

Changing arguments for clarity.
How would this formula compare efficiency-wise?

=SUMPRODUCT(COUNTIF(range,"*"&REPT(char&"*",instance_array)))
....

My formula would loop through range 5 times: the first LEN call, the
LOWER call, the SUBSTITITE call, the second LEN call, and the final
SUMPRODUCT call. However, my SUBSTITUTE call would only iterate
through range once, and perform a simple operation - effectively
deleting all w's.

Your formula would iterate through range as many times as there are
entries in instance array for the COUNTIF call, in your example 3
times. However, processing a range with COUNTIF where the second
argument contains wildcards isn't as simple as my SUBSTITUTE call. If
COUNTIF special cases patterns in which the first, last or both chars
are *, then COUNTIF should process the pattern "*w*" at least as fast
as SUBSTITUTE(range,"w","",1) would, and probably faster. However,
it's unlikely COUNTIF special cases patterns with *'s between literal
characters. When those occur, you need significantly more logic FOR
EACH entry in range, approximating an inner loop.

Your formula as you wrote it could effectively require 6 iterations
through range: once for "*w*", effectively 2 for "*w*w*", and
effectively 3 for "*w*w*w*". That is, if R were the number of entries
in range, A were the number of sequential entries in instance_array,
then my formula would be O(N) with a large constant (c), but yours
would be O(N A^2) with a small constant (d). As long as c > d A^2,
yours would be faster. But as soon as c < d A^2, mine would be faster.
I'd guess yours would always be faster for A = 2, usually faster for A
= 3 except when most entries in range have at least 2 w's, and seldom
if ever faster for A >= 4.
 
Thanks for the analysis. I assume that for this part...

Your formula as you wrote it could effectively require
6 iterations through range: once for "*w*", effectively
2 for "*w*w*", and effectively 3 for "*w*w*w*".

you use of the word "effectively" is meant to cover the fact that, for
example, in "*w*w*", once one 'w' is found, the search loop must continue on
in order to look for the next 'w'.
 
Back
Top