Capital Letters

G

Guest

Hi,

In cells c1 to h200 I have long sentences in each cell,
all the letters are in capitals. I need to write a macro
which loops through each cell and changes the letters to small
except:
1)The first letter
2) Any letter after a full stop
3) i

e.g. THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE

Becomes

The day was suuny and I wore a hat.Peter was there

Thanks A Million for any help
 
K

Ken Wright

From a previous post:-

Sub MakeProperCase()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

Dim myCell As Range
Dim myRng As Range

On Error Resume Next
Set myRng = Intersect(Selection, _
Selection.Cells _
.SpecialCells(xlCellTypeConstants, xlTextValues))
On Error GoTo 0

If myRng Is Nothing Then
MsgBox "Please select a range that contains text--no formulas!"
Exit Sub
End If

For Each myCell In myRng.Cells
myCell.Value = StrConv(myCell.Value, vbProperCase)
Next myCell
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub
 
G

Guest

Thanks for this KEn - acouple of things

THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE
becomes....

The Day Was Sunny And I Wore A Hat.Peter Was There

howevr I want:
The day was sunny and I wore a hat.Peter was there
 
G

Guest

Hello Teresa,

I wrote a little UDF for you:

'----------------------------------------------------------------------------------------------
Function sCase(ByVal strIn As String) As String
Dim bArr() As Byte, i As Long, i2 As Long
Let bArr = StrConv(LCase$(strIn), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122
bArr(0) = bArr(0) - 32
End Select
For i = 1 To UBound(bArr)
Select Case bArr(i)
Case 105
If Not i = UBound(bArr) Then
Select Case bArr(i + 1)
Case 33, 39, 44, 46, 58, 59, 63, 148, 160
bArr(i) = bArr(i) - 32
Case 32
If bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End Select
ElseIf bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End If
Case 33, 46, 58, 63
For i2 = i + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
i = i2
Exit For
End Select
Next
End Select
Next
sCase = StrConv(bArr, vbUnicode)
End Function

Sub testTime()
MsgBox sCase("THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE")
MsgBox sCase("no WorRies, i'm ONLY testIng! yes-no?")
MsgBox sCase("mY fRiend & i")
MsgBox sCase("iiiiiiiiiiiiii")
MsgBox sCase("22 Years.")
End Su
'----------------------------------------------------------------------------------------------

I hope you like it. To date, I have only given the function a limited amount
of thought/testing, so if you find it needs adjusting, please post back.

Regards,
Nate Oliver
 
G

Guest

Hmmm, one possible glitch, a sentence starting with a non-alpha char; I try
to handle with the following adjustment:

'-------------------------------------------------------------------------------------------Function sCase(ByVal strIn As String) As String
Dim bArr() As Byte, i As Long, i2 As Long
Let bArr = StrConv(LCase$(strIn), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122
bArr(0) = bArr(0) - 32
End Select
For i = 1 To UBound(bArr)
Select Case bArr(i)
Case 105
If Not i = UBound(bArr) Then
Select Case bArr(i + 1)
Case 33, 39, 44, 46, 58, 59, 63, 148, 160
bArr(i) = bArr(i) - 32
Case 32
If bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End Select
ElseIf bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End If
Case 33, 46, 58, 63
For i2 = i + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
i = i2: Exit For
Case Is <> 32
i = i2: Exit For
End Select
Next
End Select
Next
sCase = StrConv(bArr, vbUnicode)
End Function

Sub testTime()
MsgBox sCase("THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE")
MsgBox sCase("no WorRies, i'm ONLY testIng! yes-no?")
MsgBox sCase("mY fRiend & i")
MsgBox sCase("iiiiiiiiiiiiii")
MsgBox sCase("22 Years.")
MsgBox sCase("How Old?! 5 Years.")
MsgBox sCase("****T. %T i @")
End Su
'-------------------------------------------------------------------------------------------

Regards,
Nate Oliver
 
D

David McRitchie

If you would prefer a macro to make the changes in place to constants
Tushar' Mehta's code has worked well for me, It uses regular expressions rather than
checking each letter so would be faster. Like anything else you may have to make
changes for words or acronyms that should be in all caps and for proper nouns.

Sentence_Case, Re: Use VBS RegExp to replace a-z with A-Z?, Tushar Mehta, programming, 2002-08-04. .See Daniel's correction if cell
begins with spaces in a reply.
http://google.co.uk/[email protected]

Some notes on Regular Expressions to understand Tushar's macro.
http://www.mvps.org/dmcritchie/excel/grove_digitsid.htm#regexp

My page on Proper and other letter case changes
http://www.mvps.org/dmcritchie/excel/proper.htm

I have used the google.co.uk link earlier to avoid errors that would likely be introduced
by the disastrous Google Beta Groups. You can read more about that at
http://www.mvps.org/dmcritchie/excel/betagroups.htm
 
G

Guest

Hello David,

With all due respect, I did test the function you mentioned before posting
to this thread, and unless I'm missing something, it doesn't appear to work
on the OP's string, i.e.,

‘-------------------
Sub testIt()
MsgBox CapFirstLetterOfSentences( _
"THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE")
End Sub
‘-------------------

Doesn’t return the desired output. Could someone fix that? Perhaps, however,
it won't be me.

In terms of performance, I have a couple of thoughts.

1) I would be hesitant to pass judgment on the Byte Array approach before
using a hi-resolution timer. Not all linear searches are created equally,
Byte Arrays are light-weight and your processor knows what to do when it
comes to manipulating small, long integers. In my experience, they are
extremely fast, even while they might look slow at a glance. And it raises
the question in my mind, if RegExp isn't checking each character under the
hood, what is it doing? I'm admittedly extremely green when it comes to
RegExp and I have seen some nice implementations.

2) If you're going to call the RegExp function against multiple cells and
you want to compete with the Byte Array for performance, you might have some
work to do. Binding with the respective object on each call is going to cost
you a fair amount. If you wanted to optimize, you'd either want to
demodularize the code into an inline subroutine with a single bind or
reconstruct the function to return an array, again, using a single bind.

Now, I haven't time-tested the SCase() function I have posted, and I have
yet to see the RegExp equivalent, so it's hard to speak to relative
performance at this point.

Getting a procedure to implement the algorithm is the easy part, the trick
is to get an algorithm that works, as desired, up and running.

I made another change after considering the possibility of ongoing
punctuation (e.g., "Hmmm..."). The revision is as follows:

'---------------------
Function sCase(ByVal strIn As String) As String
Dim bArr() As Byte, i As Long, i2 As Long
Let bArr = StrConv(LCase$(strIn), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122
bArr(0) = bArr(0) - 32
End Select
For i = 1 To UBound(bArr)
Select Case bArr(i)
Case 105
If Not i = UBound(bArr) Then
Select Case bArr(i + 1)
Case 33, 39, 44, 46, 58, 59, 63, 148, 160
bArr(i) = bArr(i) - 32
Case 32
If bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End Select
ElseIf bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End If
Case 33, 46, 58, 63
For i2 = i + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
i = i2: Exit For
End Select
If bArr(i2) <> 32 And bArr(i2) <> 33 And bArr(i2) <> 46 _
And bArr(i2) <> 63 Then
i = i2: Exit For
End If
Next
End Select
Next
sCase = StrConv(bArr, vbUnicode)
End Function

Sub testTime()
Debug.Print sCase$("how old?! 22 Years.")
Debug.Print sCase$("how old?! twenty-two Years.")
Debug.Print sCase$("hmmmm.... wOrking?!?! sam i am. yes-no? isn't i'm isn't.
")
Debug.Print sCase$("THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE")
End Sub
'---------------------

I’m probably missing quite a few other aspects to the language as well...

Change a range? Try the following:

'---------------------
Sub chngRange()
Dim cl As Range
Application.ScreenUpdating = False
For Each cl In Range("c2:h200").SpecialCells(xlConstants, xlTextValues)
cl.Value = sCase(cl.Value)
Next
Application.ScreenUpdating = True
End Sub
'---------------------

I set up c2:h200 with the following string:

"THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE"

The above procedure converted the 1,194 cells in question in ~.1 seconds.
This strikes me as relatively fast, while your results will obviously vary
with the strings in question.

Regards,
Nate Oliver
 
D

David McRitchie

Hi Nate,
WOW. A lot of testing. The main point was that a subroutine run
only once is going to take zero time during recalculations, and I
expect that the user would really prefer to change the data once
and be done with it.

I did not test your function for time or reliability. Yes it is easy to
incorporate the user function into a SUB which you did afterward.
What Excel actually does under the hood is going to run circles
around you using VBA because you are not even close to machine
instructions. In fact there are probably machine instructions for
AND-ind and OR-ing longs stings of bits or at least to separate
lowercase from uppercase letters..

I just tested your function, and it failed (#VALUE!) on the very first cell which
was empty. I know the poster said he started with all caps,
and your encompassing SUB would eliminate that possibility.
..
I would first convert to lowercase with a separate macro myself
and use a few macros as builting blocks. The second macro
would be Tushar's macro. Then looking over results.

I prefer running two macros to having options in macros or lots
of macros that are almost the same. [If the macro is only going
to be run once it isn't going to matter much whether you spend
time looking for one macro that does everything, or use
a couple of macros that will do the trick]

Normally if there is a problem with sentence case someone typed
in something and may have capitalized some names like IBM or
their own name mine is McRitchie. Running Tushar's sentence_case
would not destroy those words, nor will it force any Capital letter
to lower case.

I'm not saying that one macro is going to fit everyone's needs,
but was just offering an alternative that I know works for me, and
I was surprised how well regular expressions worked when I saw
Tushar's solution. I immediately changed reference to Harald Staff's
solution that I thought worked fine until then.

I just took a look at your SUB and noticed it had a hard coded
range, which is certainly not kind of thing that I would do in a
general purpose macro. Was actually looking to see if someone
selected only one cell if it would mess up the entire sheet.
See my proper.htm page for selection. Don't remember if
the poster posted a specific range, but I doubt that they would
really want such an absolute restriction.


Nate Oliver said:
With all due respect, I did test the function you mentioned before posting
to this thread, and unless I'm missing something, it doesn't appear to work
on the OP's string, i.e.,
te Oliver
 
P

Peter T

Hi Nate,

I saw the OP's poser come in and wrote a function of my own. Was about to
post it, saw yours and felt confident mine would be faster. But quite the
reverse! so I'll keep very quiet about mine <g>. Actually I'm surprised just
how fast yours is - I like it!

However I think you will find this little wrapper for your function will
make a speed things up further, particularly for many cells:

Sub test()
Dim r As Long, c As Long
Dim rng As Range
Dim vCells
Set rng = Range("C1:H200")
'rng.Value = "THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE"
vCells = rng.Value

For r = 1 To UBound(vCells)
For c = 1 To UBound(vCells, 2)
If Len(vCells(r, c)) Then
'any other pre-tests ?
vCells(r, c) = sCase(vCells(r, c))
End If
Next
Next
rng.Value = vCells
End Sub

Regards,
Peter T
 
M

Myrna Larson

Does it work like this (I haven't tried it)?

Sub TestIt()
MsgBox CapFirstLetterOfSentences( _
LCase$("THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE"))
End Sub
 
G

Guest

Yes, it does Myrna, which David's post lead me to understand; my thanks to
the both of you.

This would have a negative impact on IBM and/or McRitchie, which may be
unavoidable outside of setting up an array of key terms given the all-cap
starting point.

And, I chose not to deal with proper nouns or acronyms, in that sense, the
function is (very?) incomplete.

I did a little testing said:
I just tested your function, and it failed (#VALUE!) on the very first cell which
was empty. I know the poster said he started with all caps,
and your encompassing SUB would eliminate that possibility.

Quite right, a slight oversight on my part! And a quick fix:

'-------------------
Function sCase(ByVal strIn As String) As String
Dim bArr() As Byte, I As Long, i2 As Long
If strIn = vbNullString Then Exit Function
Let bArr = StrConv(LCase$(strIn), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122
bArr(0) = bArr(0) - 32
End Select
For I = 1 To UBound(bArr)
Select Case bArr(I)
Case 105
If Not I = UBound(bArr) Then
Select Case bArr(I + 1)
Case 33, 39, 44, 46, 58, 59, 63, 148, 160
bArr(I) = bArr(I) - 32
Case 32
If bArr(I - 1) = 32 Then _
bArr(I) = bArr(I) - 32
End Select
ElseIf bArr(I - 1) = 32 Then _
bArr(I) = bArr(I) - 32
End If
Case 33, 46, 58, 63
For i2 = I + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
I = i2: Exit For
End Select
If bArr(i2) <> 32 And bArr(i2) <> 33 And bArr(i2) <> 46 _
And bArr(i2) <> 63 Then
I = i2: Exit For
End If
Next
End Select
Next
sCase = StrConv(bArr, vbUnicode)
End Function
'-------------------

Thanks.

The range I hard-coded was the specified range by the OP, and you're
correct, one could very easily make this dynamic without too much effort.

You could convert SCase() to an inline sub or crank out an array as Peter T
has done, or disable recalculations temporarily. The algorithm itself is the
most time intensive consideration I suspect.

Thanks for the feedback Peter, glad to hear you like it. Byte Arrays are
pretty efficient. And I didn't attempt to optimize that sub procedure, but
indeed, looping through an Array should be much faster than looping through a
Range if they're sized equally.

The speed of using a Byte Array caught my attention one day I when I decided
I was going to crop an MP3 file with Excel. So I wrote a procedure using
Binary File Access and a couple of Byte Arrays:

http://mrexcel.com/board2/viewtopic.php?p=306631#306631

(I'm not sure that particular code is optimized...) In any case, I thought I
was going to be sitting around all day, but not at all. I didn't even need a
timer to see how fast these things are. I just did it again on a 7.5 MB file,
while the procedure loops 7.5 million times (used Currency variables), I cut
the file in half in 16 seconds. This strikes me as fast.

I'm not necessarily saying you should use RegExp or Byte arrays over one
another, in fact I've seen string parsing on very large strings performed
more efficiently with RegExp that I could match with the Byte Array. It's
hard to get around some of the overhead with StrConv() and that VBA is a
high-level and slower language.

I just thought I'd write a UDF for Teresa that I knew would be pretty
quick-like and fairly robust.

Cheers,
Nate Oliver
 
G

Guest

One more revision, I mishandled an 'I':

'------------------
Function sCase(ByVal strIn As String) As String
Dim bArr() As Byte, I As Long, i2 As Long
If strIn = vbNullString Then Exit Function
Let bArr = StrConv(LCase$(strIn), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122
bArr(0) = bArr(0) - 32
End Select
For I = 1 To UBound(bArr)
Select Case bArr(I)
Case 105
If Not I = UBound(bArr) Then
Select Case bArr(I + 1)
Case 32, 33, 39, 44, 46, 58, 59, 63, 148, 160
If bArr(I - 1) = 32 Then _
bArr(I) = bArr(I) - 32
End Select
ElseIf bArr(I - 1) = 32 Then _
bArr(I) = bArr(I) - 32
End If
Case 33, 46, 58, 63
For i2 = I + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
I = i2: Exit For
End Select
If bArr(i2) <> 32 And bArr(i2) <> 33 And bArr(i2) <> 46 _
And bArr(i2) <> 63 Then
I = i2: Exit For
End If
Next
End Select
Next
sCase = StrConv(bArr, vbUnicode)
End Function

Sub testTime()
Debug.Print sCase("hello? erm, i'M only testing, eh. indeed, " & _
"i am inquisitve.")
Debug.Print sCase$("how old?! 22 Years.")
Debug.Print sCase$("how old?! twenty-two Years.")
Debug.Print sCase$("hmmmm.... wOrking?!?! sam i am. yes-no? " & _
"isn't i'm isn't.")
Debug.Print sCase("THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE")
Debug.Print sCase("no WorRies, i'm ONLY testIng! yes-no?")
Debug.Print sCase("mY fRiend & i")
Debug.Print sCase("iiiiiiiiiiiiii")
Debug.Print sCase("****T. toast %T i @")
Debug.Print sCase("re: sentences.")
End Sub
'------------------

Regards,
Nate Oliver
 
P

Peter T

Nate - for your interest I compared times calling your function and Tushar
Mehta's CapFirstLetterOfSentences with vbscript.regexp (in the link referred
to earlier by David McRitchie). I called each with the routine I posted
earlier, range of 1200 strings using range > array > loop > array > range.

Your method was 30 x faster.

But this wasn't fair, as in the function a late binding reference to regexp
is set 1200 times. So I converted it to an in line sub using similar range >
array > loop > array > range.

Much faster but still 1.6 times slower than your function, which is probably
inconsequential. I don't know enough about either method to form an overall
opinion, no doubt pros/cons to each. But interesting!

Tushar's function as posted only converts first character to a capital, but
I think not relevant to this comparison. However purely for doing that, I
found it 3 x faster to do simply:
s = UCase(Left$(s, 1)) & LCase(Mid$(s, 2, Len(s)))

Regards,
Peter T
 
G

Guest

Hello again Peter,

Thanks for testing!

I’ve done some comparisons on more comparable RegExp versus Byte Array
functions with respect to string parsing and my findings have been similar.
Much faster but still 1.6 times slower than your function, which is probably inconsequential.

I don’t see how it could be inconsequential. One of my goals when writing a
UDF is to write an efficient function, so for me it’s of consequence.

It also serves to alleviate confusion regarding the following untested and
inaccurate testimony that found its way into this thread:

And to clear up further confusion, it’s Regular Expression working on the
strings under the hood, not Excel. No, no and no.

The reason I say the two functions are not necessarily comparable is that
the algorithm I’ve posted is more robust. It handles varying punctuation and
I’s (which the OP specifically asked for), e.g.,

MsgBox CapFirstLetterOfSentences( _
LCase$("hello? hi! just me, myself and i, while i'm testing."))
MsgBox sCase("hello? hi! just me, myself and i, while i'm testing.")

Surely Sentence Casing does not consider full stops alone?

Could one layer in this functionality with the RegExp approach? Perhaps, but
I doubt that would speed it up. So not only is sCase() as posted more
efficient, it’s more robust, to the extent where comparing the two might not
even make sense.
Tushar's function as posted only converts first character to a capital, but
I think not relevant to this comparison. However purely for doing that, I
found it 3 x faster to do simply:
s = UCase(Left$(s, 1)) & LCase(Mid$(s, 2, Len(s)))

Not exactly, that function anticipates and effectively deals with multiple
sentences with full stops, which the OP had and inquired about. Try the
following:

Const TST_STR As String = "SENTENCE ONE. SENTENCE TWO. SENTENCE THREE."
MsgBox CapFirstLetterOfSentences(LCase$(TST_STR))
MsgBox UCase(Left$(TST_STR, 1)) & LCase(Mid$(TST_STR, 2, Len(TST_STR)))
MsgBox sCase(TST_STR)

As I mentioned prior to my recent adjustments to sCase(), it’s been given
limited thought and testing, but here’s what I see:

-It’s very fast
-It’s flexible/robust
-It offers 100% compatibility for Excel users
-It places no reliance on 3rd party objects/it's native
-Reuse is a snap
-Maintenance is straightforward

It begs the question, what else does one want in String parsing function?

Have a nice weekend.

Regards,
Nate Oliver
 
P

Peter T

Hi Nate,

Thanks for your follow up, I agree with some of your points but not entirely
<g>. You had already sold me on the idea of the Byte Array method but I
could still be a buyer of RegExp.

Re: my comment "1.6 x slower [byte vs regexp] is probably inconsequential"
and your counter. I should have said "could be inconsequential", I had
several things in mind -

Despite my allowances for the references, my limited test was still not
quite comparable, Tushar's routine also and necessarily includes some string
functions.

1.6 x slower than extremely fast is still extremely fast. With small qty's
of strings
(say 1200) and called occasionally, to me, then it is inconsequential.
That's not the same as insignificant but other factors become more
important. It's a kind of Quick vs Bubble Sort choice.

I suspect, but don't yet know, there may be occasions when regexp would be
the preferred method. But with my limited knowledge of regexp I would elect
for your Byte method.

Why - I've used RegExp just few times between long intervals. Each time I
need to re-learn the pattern syntax from scratch, and that takes me a very
long time to write and test! Another thing - due to the delay it takes to
reference regexp, it would require passing multiple strings as an array and
looping within the function. Makes it slightly less portable. By contrast,
once the Byte Array method is understood it can be worked out from first
principals - no re-learning, also some of the other advantages you
mentioned.

However - and you're not going to like this! - for the task given by the OP
and limited calls, I would use my string function (the one I mentioned but
didn't post). It's between almost zero to 3 x slower than your method
(depends on string & corrections), but has its advantages where ultimate
speed is not a prerequisite:

Why - I understand it well, only took a few minutes to write, and rightly or
lazily to me that's a consideration. Corrections and acronyms etc are easy
to include, even by a VBA novice later without needing to understand the
entire function (I think), albeit at some degradation in performance. In
other words easy to maintain.

I suppose I ought to put up or shut up, so I'll suffer the consequences of
you pointing out errors or limitations and post below.

I'm very pleased you posted your Byte function, can only wonder if the OP
is!

Regards,
Peter T

If viewing in Googal proportional font, suggest switching to fixed font, or
Options > show original.
Public Declare Function GetTickCount Lib "kernel32.dll" () As Long

Sub Test_SentCap()
Dim r As Long, c As Long
Dim rng As Range
Dim vCells

Dim t1 As Long, t2 As Long
Dim str As String, i As Byte
Dim aTxt(1 To 12) 'As String

aTxt(1) = "this is the 1ST sentence. second sentence. This " _
& "requires no caps.a sentence with no leading spaces." _
& " sentence with 10 leading spaces"
aTxt(2) = "hello? erm, i'M only testing, eh. " _
& "indeed, i am inquisitve."
aTxt(3) = "how old?! 22 Years."
aTxt(4) = "how old?! twenty-two Years."
aTxt(5) = "hmmmm.... wOrking?!?! sam i am. yes-no? isn't i'm isn't."
aTxt(6) = "THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE"
aTxt(7) = "no WorRies, i'm ONLY testIng! yes-no?"
aTxt(8) = "mY fRiend & i"
aTxt(9) = "iiiiiiiiiiiiii"
aTxt(10) = "****T. toast %T i @"
aTxt(11) = "re: sentences." 'doesn't yet correct re: s > Re: S
aTxt(12) = "HI MR OLIVER. IT'S I, PETER"

Set rng = Range("C1:h200") '1200 cells

For i = 1 To 12
rng.Clear
rng.Value = aTxt(i)
'Stop ' and have a look?
t1 = GetTickCount
vCells = rng.Value

t2 = GetTickCount
For r = 1 To UBound(vCells)
For c = 1 To UBound(vCells, 2)
SentCap vCells(r, c) ', True
Next
Next
t2 = GetTickCount - t2

rng.Value = vCells
t1 = GetTickCount - t1
'Stop ' and have a look?
Debug.Print aTxt(i)
Debug.Print vCells(1, 1)
Debug.Print "Overall " & t1 / 1000, , "Function " & t2 / 1000
Debug.Print

Next
Set rng = Nothing: Erase vCells
End Sub

Function SentCap(vs) As Boolean
Dim b As Boolean
Dim i As Long, k As Long, naCnt As Byte
Dim n As Long, t As Long
Dim nLen As Long
Dim aCorrect() As String
Const cSPACE As String = " "
Const cDOT As String = "."

'A few simple corrections
naCnt = 8 ' count of corrections
ReDim aCorrect(1 To naCnt, 0 To 1) As String

'could maintain this array as Static & erase when
'done, if called as a big loop.
'or pass as an argument for special purposes
aCorrect(1, 0) = " i ": aCorrect(1, 1) = " I "
aCorrect(2, 0) = " i.": aCorrect(2, 1) = " I."
aCorrect(3, 0) = " i,": aCorrect(3, 1) = " I,"
aCorrect(4, 0) = " i'": aCorrect(4, 1) = " I'"
aCorrect(5, 0) = "mr": aCorrect(5, 1) = "Mr"
aCorrect(6, 0) = "mrs": aCorrect(6, 1) = "Mrs"
aCorrect(7, 0) = "oliver": aCorrect(7, 1) = "Oliver"
aCorrect(8, 0) = "peter": aCorrect(8, 1) = "Peter"
' careful of names/acronyms that could be suffix or prefix
' or common words, eg "miss", "nate"

vs = CStr(vs)
nLen = Len(vs)
n = 1: t = 1
ReDim va(0) 'array for each sentance

'split each sentance ending in DOT
' (probably easier to use vba's Split function)
Do While n > 0
n = InStr(n, vs, cDOT)
If n Then
'include any more spaces after the DOT
b = True
Do While (b)
b = Mid$(vs, n + 1, 1) = cSPACE
If b Then
n = n + 1
End If
Loop

If cnt Then ReDim Preserve va(cnt)
va(cnt) = Mid(vs, t, n - t + 1)
cnt = cnt + 1
n = n + 1
t = n
End If
Loop

If t - 1 < nLen Then
'in case last sentance doesn't end with a DOT
If cnt Then ReDim Preserve va(cnt + 1)
va(cnt) = Mid$(vs, t, nLen - t + 1)
End If

For i = 0 To UBound(va)
'process each sentance, starting with initial capital
va(i) = UCase$(Left$(va(i), 1)) & LCase$(Mid$(va(i), 2, Len(va(i))))

'now the corrections
For k = 1 To naCnt
If InStr(1, va(i), aCorrect(k, 0)) Then
#If VBA6 Then
va(i) = Replace(va(i), aCorrect(k, 0), aCorrect(k, 1))
#Else
'for Excel 97
va(i) = Application.Substitute(va(i), " i ", " I ")
#End If
End If
Next

'any more corrections, eg trailing "I" without punctuation
If Right$(va(i), 2) = " i" Then
va(i) = Left$(va(i), Len(va(i)) - 2) & " I"
End If

Next

're-join the sentances
'this loop seems as fast as "vs = Join(va)" & OK for Excel 97
For i = 1 To UBound(va)
va(0) = va(0) & va(i)
Next
vs = va(0)

End Function
 
D

David McRitchie

Here is Nate's code converted to a just a macro which would
be much faster. I really can't see someone wanting to convert
one cell on a workbook based on another cell for this kind of thing.

Though I think the usual requirement would be that the first letter
and/or first letter after a period (or ? or !) would all that would
need to be changed so I would leave out the initial conversion to
lowercase, except that that was a requirement of the poster.
For which I previously indicated I would have used a separate
macro to convert everything to lowercase first if really needed
Rather than destroying existing proper nouns.

As a straight macro it will run much faster than the macro calling
the function for each cell. Time comparisons at end at end.

Don't know what was to be tested with checking for an ending
(close) double quotes.

Sub sCase_mac()
' Sentence case, Nate Oliver, 2005-03-17
' http://groups.google.co.uk/groups?threadm=B8548560-5310-4C5E-A3E9-2CD1F94431CB@microsoft.com
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim bArr() As Byte, i As Long, i2 As Long
Dim cell As Range
For Each cell In Intersect(Selection, _
Selection.SpecialCells(xlConstants, xlTextValues))
If cell.Value = vbNullString Then GoTo nextcell
Let bArr = StrConv(LCase$(cell.Value), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122 'lowercase a-z
bArr(0) = bArr(0) - 32
End Select
For i = 1 To UBound(bArr)
Select Case bArr(i)
Case 105 'lowercase i, going for I'm
If Not i = UBound(bArr) Then
Select Case bArr(i + 1)
' space ! ' , . : ; ? rdquo nbsp
Case 32, 33, 39, 44, 46, 58, 59, 63, 148, 160
If bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End Select
ElseIf bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End If
Case 33, 46, 58, 63 '-- ! . : ?
For i2 = i + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
i = i2: Exit For
End Select
If bArr(i2) <> 32 And bArr(i2) <> 33 And bArr(i2) <> 46 _
And bArr(i2) <> 63 Then
i = i2: Exit For
End If
Next
End Select
Next i
cell.Value = StrConv(bArr, vbUnicode)
nextcell:
Next cell
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub

Timing for calling the RegExpr included the following
between the begin and end timing points. But no attempt
was made to handle end of sentences, by question mark,
and explanation point, which Nate's macros has covered.

Selection.Replace what:=" i'm ", replacement:=" I'm ", _
lookat:=xlPart, searchorder:=xlByRows, MatchCase:=False

Testing: (about 700 cells each in each test.)
Latest test at top, averages at bottom.

RegExpr calls Funct. strictly Macro
0.341 secs., 0.280 secs., 0.200 secs.,
0.341 secs., 0.280 secs., 0.210 secs.,
0.351 secs., 0.280 secs., 0.201 secs.,
0.351 secs., 0.280 secs., 0.201 secs.,
0.350 secs., 0.290 secs., 0.221 secs.,
0.360 secs., 0.281 secs., 0.210 secs.,
0.350 secs., 0.301 secs., 0.210 secs.,
0.350 secs., 0.291 secs., 0.220 secs.,
0.360 secs., 0.291 secs., 0.200 secs.,
0.390 secs., 0.301 secs., 0.210 secs.,
0.481 secs., 0.401 secs., 0.220 secs.,
0.481 secs., 0.410 secs., 0.211 secs.,
0.491 secs., 0.400 secs., 0.200 secs.,
0.480 secs., 0.411 secs., 0.210 secs.,
0.481 secs., 0.411 secs., 0.200 secs.,
0.491 secs., 0.410 secs., 0.211 secs.,
0.490 secs., 0.411 secs., 0.210 secs.,
0.501 secs., 0.421 secs., 0.200 secs.,
0.491 secs., 0.410 secs., 0.211 secs.,
0.490 secs., 0.421 secs., 0.200 secs.,
0.491 secs., 0.421 secs., 0.210 secs.,
averages--------------------------------------------------
0.424 secs., 0.352 secs., 0.208 secs.



Peter T said:
Hi Nate,

Thanks for your follow up, I agree with some of your points but not entirely
<g>. You had already sold me on the idea of the Byte Array method but I
could still be a buyer of RegExp.

Re: my comment "1.6 x slower [byte vs regexp] is probably inconsequential"
and your counter. I should have said "could be inconsequential", I had
several things in mind -

Despite my allowances for the references, my limited test was still not
quite comparable, Tushar's routine also and necessarily includes some string
functions.

1.6 x slower than extremely fast is still extremely fast. With small qty's
of strings
(say 1200) and called occasionally, to me, then it is inconsequential.
That's not the same as insignificant but other factors become more
important. It's a kind of Quick vs Bubble Sort choice.

I suspect, but don't yet know, there may be occasions when regexp would be
the preferred method. But with my limited knowledge of regexp I would elect
for your Byte method.

Why - I've used RegExp just few times between long intervals. Each time I
need to re-learn the pattern syntax from scratch, and that takes me a very
long time to write and test! Another thing - due to the delay it takes to
reference regexp, it would require passing multiple strings as an array and
looping within the function. Makes it slightly less portable. By contrast,
once the Byte Array method is understood it can be worked out from first
principals - no re-learning, also some of the other advantages you
mentioned.

However - and you're not going to like this! - for the task given by the OP
and limited calls, I would use my string function (the one I mentioned but
didn't post). It's between almost zero to 3 x slower than your method
(depends on string & corrections), but has its advantages where ultimate
speed is not a prerequisite:

Why - I understand it well, only took a few minutes to write, and rightly or
lazily to me that's a consideration. Corrections and acronyms etc are easy
to include, even by a VBA novice later without needing to understand the
entire function (I think), albeit at some degradation in performance. In
other words easy to maintain.

I suppose I ought to put up or shut up, so I'll suffer the consequences of
you pointing out errors or limitations and post below.

I'm very pleased you posted your Byte function, can only wonder if the OP
is!

Regards,
Peter T

If viewing in Googal proportional font, suggest switching to fixed font, or
Options > show original.
Public Declare Function GetTickCount Lib "kernel32.dll" () As Long

Sub Test_SentCap()
Dim r As Long, c As Long
Dim rng As Range
Dim vCells

Dim t1 As Long, t2 As Long
Dim str As String, i As Byte
Dim aTxt(1 To 12) 'As String

aTxt(1) = "this is the 1ST sentence. second sentence. This " _
& "requires no caps.a sentence with no leading spaces." _
& " sentence with 10 leading spaces"
aTxt(2) = "hello? erm, i'M only testing, eh. " _
& "indeed, i am inquisitve."
aTxt(3) = "how old?! 22 Years."
aTxt(4) = "how old?! twenty-two Years."
aTxt(5) = "hmmmm.... wOrking?!?! sam i am. yes-no? isn't i'm isn't."
aTxt(6) = "THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE"
aTxt(7) = "no WorRies, i'm ONLY testIng! yes-no?"
aTxt(8) = "mY fRiend & i"
aTxt(9) = "iiiiiiiiiiiiii"
aTxt(10) = "****T. toast %T i @"
aTxt(11) = "re: sentences." 'doesn't yet correct re: s > Re: S
aTxt(12) = "HI MR OLIVER. IT'S I, PETER"

Set rng = Range("C1:h200") '1200 cells

For i = 1 To 12
rng.Clear
rng.Value = aTxt(i)
'Stop ' and have a look?
t1 = GetTickCount
vCells = rng.Value

t2 = GetTickCount
For r = 1 To UBound(vCells)
For c = 1 To UBound(vCells, 2)
SentCap vCells(r, c) ', True
Next
Next
t2 = GetTickCount - t2

rng.Value = vCells
t1 = GetTickCount - t1
'Stop ' and have a look?
Debug.Print aTxt(i)
Debug.Print vCells(1, 1)
Debug.Print "Overall " & t1 / 1000, , "Function " & t2 / 1000
Debug.Print

Next
Set rng = Nothing: Erase vCells
End Sub

Function SentCap(vs) As Boolean
Dim b As Boolean
Dim i As Long, k As Long, naCnt As Byte
Dim n As Long, t As Long
Dim nLen As Long
Dim aCorrect() As String
Const cSPACE As String = " "
Const cDOT As String = "."

'A few simple corrections
naCnt = 8 ' count of corrections
ReDim aCorrect(1 To naCnt, 0 To 1) As String

'could maintain this array as Static & erase when
'done, if called as a big loop.
'or pass as an argument for special purposes
aCorrect(1, 0) = " i ": aCorrect(1, 1) = " I "
aCorrect(2, 0) = " i.": aCorrect(2, 1) = " I."
aCorrect(3, 0) = " i,": aCorrect(3, 1) = " I,"
aCorrect(4, 0) = " i'": aCorrect(4, 1) = " I'"
aCorrect(5, 0) = "mr": aCorrect(5, 1) = "Mr"
aCorrect(6, 0) = "mrs": aCorrect(6, 1) = "Mrs"
aCorrect(7, 0) = "oliver": aCorrect(7, 1) = "Oliver"
aCorrect(8, 0) = "peter": aCorrect(8, 1) = "Peter"
' careful of names/acronyms that could be suffix or prefix
' or common words, eg "miss", "nate"

vs = CStr(vs)
nLen = Len(vs)
n = 1: t = 1
ReDim va(0) 'array for each sentance

'split each sentance ending in DOT
' (probably easier to use vba's Split function)
Do While n > 0
n = InStr(n, vs, cDOT)
If n Then
'include any more spaces after the DOT
b = True
Do While (b)
b = Mid$(vs, n + 1, 1) = cSPACE
If b Then
n = n + 1
End If
Loop

If cnt Then ReDim Preserve va(cnt)
va(cnt) = Mid(vs, t, n - t + 1)
cnt = cnt + 1
n = n + 1
t = n
End If
Loop

If t - 1 < nLen Then
'in case last sentance doesn't end with a DOT
If cnt Then ReDim Preserve va(cnt + 1)
va(cnt) = Mid$(vs, t, nLen - t + 1)
End If

For i = 0 To UBound(va)
'process each sentance, starting with initial capital
va(i) = UCase$(Left$(va(i), 1)) & LCase$(Mid$(va(i), 2, Len(va(i))))

'now the corrections
For k = 1 To naCnt
If InStr(1, va(i), aCorrect(k, 0)) Then
#If VBA6 Then
va(i) = Replace(va(i), aCorrect(k, 0), aCorrect(k, 1))
#Else
'for Excel 97
va(i) = Application.Substitute(va(i), " i ", " I ")
#End If
End If
Next

'any more corrections, eg trailing "I" without punctuation
If Right$(va(i), 2) = " i" Then
va(i) = Left$(va(i), Len(va(i)) - 2) & " I"
End If

Next

're-join the sentances
'this loop seems as fast as "vs = Join(va)" & OK for Excel 97
For i = 1 To UBound(va)
va(0) = va(0) & va(i)
Next
vs = va(0)

End Function
 
P

Peter T

Hi David,

It was good of you to post your adaptation of Nate's Byte Array function and
test comparisons.
I really can't see someone wanting to convert
one cell on a workbook based on another cell for this kind of thing.

Afraid I don't follow, as I understand it the function and your sub only
convert existing cells. Possibly I've missed something from earlier in the
thread.
As a straight macro it will run much faster than the macro calling
the function for each cell.

I couldn't replicate your time difference's, straight macro vs called
function. In my testing only minimal difference. Whatever, the sub or
function could still be "in line" with the entire range or array is passed
as an argument, and looped within. Would avoid hardcoding the range within
the sub/func, or restricting to a user selection, to give greater
flexibility of usage.

I found by far the greatest consumption of time is the read write from cells
with "For each Cell", much more than the actual processing. Which is why I
favour range > array > loop > array > range with larger volumes.

I'm starting to use StrConv & Byte array for some other related purposes.
FWIW reading help suggests it won't work in Mac.

Regards,
Peter T
 
D

David McRitchie

What I mean is I can't see anyone wanting to use a function
on the workbook to return the value of a cell large enough to
include an entire sentence. I would think someone would only
want to use a macro and change the values of the selected
cells in place.

A function only returns a value. The point is I can't see using
a function like that within the worksheet. It usually means
copying the result, pasting back as values and then deleting
the original column -- waste of time.

If you want I can email you my test.
 
P

Peter T

Hi David,

I see what you meant now. I hadn't envisaged this being used as a UDF,
though of course it could. Rather, as a function (or sub with arguments) in
tandem with another sub or macro, when the "cell" or an element from an
array can be passed ByRef.

This is how I tested and I guess accounts for the differences in our
comparisons. I noticed only about 2% longer between using single macro and
multiple calls to the function.

However if I've misunderstood I would be very pleased to see your test and
understand our different results. It's a shame to waste all the efforts of
an efficient routine by using it inefficiently.

Thanks,
Peter T
pmbthornton at gmail dot com
 
D

David McRitchie

Hi Peter,
Yes, I think we agree. I was a bit surprised that the
there wasn't a lot of overhead in calling a function for
each cell in a macro, which you also indicate to be the
case. Don't know if it makes much of a difference but
the function is in the same module as the macro.

More information about the timing in on my slowresp.htm
Excel page.

Actually I meant to indicate to email me if you wanted the
workbook. But since we now appear to be in agreement
was better that you replied here.

I've sent my test workbook to you. Data of interest is below
the timing rows, but all the rows go through the testing including
previous timings. I used the avg on the task bar to show
averages in my posting.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top