Sorting Challenge for Array Wizards

C

Captain_Nemo

To create a alphabetical sorting calculation via worksheet functions, I
took the following approach that requires nine scratch columns. I'd like
to find an array implementation.

Assume the text to be sorted is in Column "A". The eight Columns B:I
contain, filled right and down:

=IF(ISERROR(CODE(UPPER(MID($A1,n,1)))),"00",CODE(UPPER(MID($A1,n,1))))

Where "n" is COLUMN()-1. The MID()'s return a single letter from left
to right in order from $A1.

The UPPER() changes it to upper case, obviously, but more importantly
makes the CODE() return 2-digit ASCII.

The ISERROR() returns text "00" (zero-zero) when the CODE(UPPER(MID())
breaks because it can't handle letters beyond the length of $A1.

Column J contains:

=VALUE(LEFT(B1&C1&D1&E1&F1&G1&H1&I1,15))

The LEFT() returns the 15 left-most digits of the 16-character
concatenation of B:I. The VALUE() turns this into a number at Excel's
limit of significant digits that represents the word in $A1.

For example, "qwerty" becomes VALUE(LEFT(8187698284890000,15)) or
818769828489000 as a number. (Note the trailing zeroes from the
ISERROR()--this is what puts short, few-character, end-of-the-alphabet
words in the right place in the next step.)

Column K contains:

=INDEX(A$1:A$3077,MATCH(SMALL(J$1:J$3077,ROW()),J$1:J$3077,0),1)

ROW() returns 1, 2, 3 ... etc

SMALL() returns the 1st smallest, then the 2nd smallest, then the 3rd,
etc. from ROW()

MATCH(,,0) returns the row the exact match is found. Fill down, and

INDEX() returns the word from Column A in that row, and by gum, 3077
words are sorted. Obviously, only perfect through 7 letter words, and
extremely servicable thru eight and beyond. My test set of varying
length words (not a dictionary) matched exactly the Data Sort done by
menu except when there was punctuation within the word (ie
"alpha-sort").

So now, my challenge for the array wizards: Can some of this be done
without the scratch columns? An alternative but similar approach is to
have the IF(ISERROR()) return zero, and create the 15-significant-number
representation using

SUM(B1*10^-1,C1*10^-3,D1*10^-5,E1*10^-7,F1*10^-9,G1*10^-11,H1*10^-13,I1*1
0^-15) or a similar SUMPRODUCT.

The last significant digit was different 1/3 of the time. In one
approach, the 16th digit is truncated, in the other it rounds. Couldn't
find a different in results, but maybe in efficiency?

Thanks in advance.
....best, Capt N.

--
Email to (e-mail address removed) (yes, you can so figure it out) ;-]

Scream and shout and jump for joy! I was here before Kilroy!

Sorry to spoil your little joke. I was here but my computer broke. ---Kilroy
 
H

Harlan Grove

To create a alphabetical sorting calculation via worksheet functions, I
took the following approach that requires nine scratch columns. I'd like
to find an array implementation.
...

Why bother? If you have strings in a range, such as A1:A100, you could rank them
in ascending order using COUNTIF, e.g.,

=COUNTIF($A$1:$A$100,"<"&A1)

This will return 0 for the lexically first string and progressively higher
values for lexically subsequent values.

If A1:A5 contained {"d";"ab";"baa";"ca";"c"}, then the array formula

=COUNTIF(A1:A5,"<"&A1:A5)

evaluates to {4;0;1;3;2}. To find the 4th string in the ordered list, use

=INDEX($A$1:$A$5,MATCH(SMALL(COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),4),
COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0))

no ancillary cells needed. To generate sorted lists of text even more
efficiently, use array formulas like

C1:
=INDEX($A$1:$A$5,MATCH(0,COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0))

C2:
=IF(COUNTIF($A$1:$A$5,C1)>COUNTIF(C$1:C1,C1),C1,
INDEX($A$1:$A$5,MATCH(ROW()-ROW($C$1),COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0)))

Select C2 and fill down as far as needed.
 
F

Frank Kabel

[.....]
C1:
=INDEX($A$1:$A$5,MATCH(0,COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0))

C2:
=IF(COUNTIF($A$1:$A$5,C1)>COUNTIF(C$1:C1,C1),C1,
INDEX($A$1:$A$5,MATCH(ROW()-ROW($C$1),COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),
0)))

Select C2 and fill down as far as needed.

Hi Harlan
nice trick!

Frank
 
C

Captain_Nemo

Harlan -

Thank you. I take it "Why bother?" means "why did I bother to use the
scratch columns?" Because. Because I couldn't see how any array math
would work, though my intuition said there must be some. I asked for an
array implementation, and that is just what the Wizard provided.

Sadly, I still don't quite get it. I take it you're counting if an
array is less than itself in the array formula

=COUNTIF(A1:A5,"<"&A1:A5)

I can almost but not quite warp my mind around this. "Array formula"
means control-entered into a range big enough to hold the answer, which
is a array the first element of which is the count < A1, the 2nd element
is the count < A2 etc?

In the last part, the C2 fill-down, C2 is array-entered, and then
filled-down?

Thanks in advance.
....best, Capt N.

Harlan Grove said:
...
..

Why bother? If you have strings in a range, such as A1:A100, you could rank
them
in ascending order using COUNTIF, e.g.,

=COUNTIF($A$1:$A$100,"<"&A1)

This will return 0 for the lexically first string and progressively higher
values for lexically subsequent values.

If A1:A5 contained {"d";"ab";"baa";"ca";"c"}, then the array formula

=COUNTIF(A1:A5,"<"&A1:A5)

evaluates to {4;0;1;3;2}. To find the 4th string in the ordered list, use

=INDEX($A$1:$A$5,MATCH(SMALL(COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),4),
COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0))

no ancillary cells needed. To generate sorted lists of text even more
efficiently, use array formulas like

C1:
=INDEX($A$1:$A$5,MATCH(0,COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0))

C2:
=IF(COUNTIF($A$1:$A$5,C1)>COUNTIF(C$1:C1,C1),C1,
INDEX($A$1:$A$5,MATCH(ROW()-ROW($C$1),COUNTIF($A$1:$A$5,"<"&$A$1:$A$5),0)))

Select C2 and fill down as far as needed.

--
Email to (e-mail address removed) (yes, you can so figure it out) ;-]

Scream and shout and jump for joy! I was here before Kilroy!

Sorry to spoil your little joke. I was here but my computer broke. ---Kilroy
 
H

Harlan Grove

=COUNTIF(A1:A5,"<"&A1:A5)

I can almost but not quite warp my mind around this.

It returns an array of the number of items in the array lexically less than
each item in the array. [F2], [F9] would show how it evaluates.
"Array formula" means control-entered into a range big enough to hold
the answer, . . .

It's [Ctrl]+[Shift]+[Enter], and there are single cell array formulas, and
the formulas I gave were single cell array formulas.
In the last part, the C2 fill-down, C2 is array-entered, and then
filled-down?

Yes, C2 is a single-cell array formula, then filled down.
 
C

Captain_Nemo

Harlan -

Thanks. I see how it goes now.

....best, Capt N.

Harlan Grove said:
=COUNTIF(A1:A5,"<"&A1:A5)

I can almost but not quite warp my mind around this.

It returns an array of the number of items in the array lexically less than
each item in the array. [F2], [F9] would show how it evaluates.
"Array formula" means control-entered into a range big enough to hold
the answer, . . .

It's [Ctrl]+[Shift]+[Enter], and there are single cell array formulas, and
the formulas I gave were single cell array formulas.
In the last part, the C2 fill-down, C2 is array-entered, and then
filled-down?

Yes, C2 is a single-cell array formula, then filled down.

--
Email to (e-mail address removed) (yes, you can so figure it out) ;-]

Scream and shout and jump for joy! I was here before Kilroy!

Sorry to spoil your little joke. I was here but my computer broke. ---Kilroy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top