Here is Nate's code converted to a just a macro which would
be much faster. I really can't see someone wanting to convert
one cell on a workbook based on another cell for this kind of thing.
Though I think the usual requirement would be that the first letter
and/or first letter after a period (or ? or !) would all that would
need to be changed so I would leave out the initial conversion to
lowercase, except that that was a requirement of the poster.
For which I previously indicated I would have used a separate
macro to convert everything to lowercase first if really needed
Rather than destroying existing proper nouns.
As a straight macro it will run much faster than the macro calling
the function for each cell. Time comparisons at end at end.
Don't know what was to be tested with checking for an ending
(close) double quotes.
Sub sCase_mac()
' Sentence case, Nate Oliver, 2005-03-17
'
http://groups.google.co.uk/groups?threadm=B8548560-5310-4C5E-A3E9-2CD1F94431CB@microsoft.com
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim bArr() As Byte, i As Long, i2 As Long
Dim cell As Range
For Each cell In Intersect(Selection, _
Selection.SpecialCells(xlConstants, xlTextValues))
If cell.Value = vbNullString Then GoTo nextcell
Let bArr = StrConv(LCase$(cell.Value), vbFromUnicode)
Select Case bArr(0)
Case 97 To 122 'lowercase a-z
bArr(0) = bArr(0) - 32
End Select
For i = 1 To UBound(bArr)
Select Case bArr(i)
Case 105 'lowercase i, going for I'm
If Not i = UBound(bArr) Then
Select Case bArr(i + 1)
' space ! ' , . : ; ? rdquo nbsp
Case 32, 33, 39, 44, 46, 58, 59, 63, 148, 160
If bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End Select
ElseIf bArr(i - 1) = 32 Then _
bArr(i) = bArr(i) - 32
End If
Case 33, 46, 58, 63 '-- ! . : ?
For i2 = i + 1 To UBound(bArr)
Select Case bArr(i2)
Case 97 To 122
bArr(i2) = bArr(i2) - 32
i = i2: Exit For
End Select
If bArr(i2) <> 32 And bArr(i2) <> 33 And bArr(i2) <> 46 _
And bArr(i2) <> 63 Then
i = i2: Exit For
End If
Next
End Select
Next i
cell.Value = StrConv(bArr, vbUnicode)
nextcell:
Next cell
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub
Timing for calling the RegExpr included the following
between the begin and end timing points. But no attempt
was made to handle end of sentences, by question mark,
and explanation point, which Nate's macros has covered.
Selection.Replace what:=" i'm ", replacement:=" I'm ", _
lookat:=xlPart, searchorder:=xlByRows, MatchCase:=False
Testing: (about 700 cells each in each test.)
Latest test at top, averages at bottom.
RegExpr calls Funct. strictly Macro
0.341 secs., 0.280 secs., 0.200 secs.,
0.341 secs., 0.280 secs., 0.210 secs.,
0.351 secs., 0.280 secs., 0.201 secs.,
0.351 secs., 0.280 secs., 0.201 secs.,
0.350 secs., 0.290 secs., 0.221 secs.,
0.360 secs., 0.281 secs., 0.210 secs.,
0.350 secs., 0.301 secs., 0.210 secs.,
0.350 secs., 0.291 secs., 0.220 secs.,
0.360 secs., 0.291 secs., 0.200 secs.,
0.390 secs., 0.301 secs., 0.210 secs.,
0.481 secs., 0.401 secs., 0.220 secs.,
0.481 secs., 0.410 secs., 0.211 secs.,
0.491 secs., 0.400 secs., 0.200 secs.,
0.480 secs., 0.411 secs., 0.210 secs.,
0.481 secs., 0.411 secs., 0.200 secs.,
0.491 secs., 0.410 secs., 0.211 secs.,
0.490 secs., 0.411 secs., 0.210 secs.,
0.501 secs., 0.421 secs., 0.200 secs.,
0.491 secs., 0.410 secs., 0.211 secs.,
0.490 secs., 0.421 secs., 0.200 secs.,
0.491 secs., 0.421 secs., 0.210 secs.,
averages--------------------------------------------------
0.424 secs., 0.352 secs., 0.208 secs.
Peter T said:
Hi Nate,
Thanks for your follow up, I agree with some of your points but not entirely
<g>. You had already sold me on the idea of the Byte Array method but I
could still be a buyer of RegExp.
Re: my comment "1.6 x slower [byte vs regexp] is probably inconsequential"
and your counter. I should have said "could be inconsequential", I had
several things in mind -
Despite my allowances for the references, my limited test was still not
quite comparable, Tushar's routine also and necessarily includes some string
functions.
1.6 x slower than extremely fast is still extremely fast. With small qty's
of strings
(say 1200) and called occasionally, to me, then it is inconsequential.
That's not the same as insignificant but other factors become more
important. It's a kind of Quick vs Bubble Sort choice.
I suspect, but don't yet know, there may be occasions when regexp would be
the preferred method. But with my limited knowledge of regexp I would elect
for your Byte method.
Why - I've used RegExp just few times between long intervals. Each time I
need to re-learn the pattern syntax from scratch, and that takes me a very
long time to write and test! Another thing - due to the delay it takes to
reference regexp, it would require passing multiple strings as an array and
looping within the function. Makes it slightly less portable. By contrast,
once the Byte Array method is understood it can be worked out from first
principals - no re-learning, also some of the other advantages you
mentioned.
However - and you're not going to like this! - for the task given by the OP
and limited calls, I would use my string function (the one I mentioned but
didn't post). It's between almost zero to 3 x slower than your method
(depends on string & corrections), but has its advantages where ultimate
speed is not a prerequisite:
Why - I understand it well, only took a few minutes to write, and rightly or
lazily to me that's a consideration. Corrections and acronyms etc are easy
to include, even by a VBA novice later without needing to understand the
entire function (I think), albeit at some degradation in performance. In
other words easy to maintain.
I suppose I ought to put up or shut up, so I'll suffer the consequences of
you pointing out errors or limitations and post below.
I'm very pleased you posted your Byte function, can only wonder if the OP
is!
Regards,
Peter T
If viewing in Googal proportional font, suggest switching to fixed font, or
Options > show original.
Public Declare Function GetTickCount Lib "kernel32.dll" () As Long
Sub Test_SentCap()
Dim r As Long, c As Long
Dim rng As Range
Dim vCells
Dim t1 As Long, t2 As Long
Dim str As String, i As Byte
Dim aTxt(1 To 12) 'As String
aTxt(1) = "this is the 1ST sentence. second sentence. This " _
& "requires no caps.a sentence with no leading spaces." _
& " sentence with 10 leading spaces"
aTxt(2) = "hello? erm, i'M only testing, eh. " _
& "indeed, i am inquisitve."
aTxt(3) = "how old?! 22 Years."
aTxt(4) = "how old?! twenty-two Years."
aTxt(5) = "hmmmm.... wOrking?!?! sam i am. yes-no? isn't i'm isn't."
aTxt(6) = "THE DAY WAS SUNNY AND I WORE A HAT.PETER WAS THERE"
aTxt(7) = "no WorRies, i'm ONLY testIng! yes-no?"
aTxt(8) = "mY fRiend & i"
aTxt(9) = "iiiiiiiiiiiiii"
aTxt(10) = "****T. toast %T i @"
aTxt(11) = "re: sentences." 'doesn't yet correct re: s > Re: S
aTxt(12) = "HI MR OLIVER. IT'S I, PETER"
Set rng = Range("C1:h200") '1200 cells
For i = 1 To 12
rng.Clear
rng.Value = aTxt(i)
'Stop ' and have a look?
t1 = GetTickCount
vCells = rng.Value
t2 = GetTickCount
For r = 1 To UBound(vCells)
For c = 1 To UBound(vCells, 2)
SentCap vCells(r, c) ', True
Next
Next
t2 = GetTickCount - t2
rng.Value = vCells
t1 = GetTickCount - t1
'Stop ' and have a look?
Debug.Print aTxt(i)
Debug.Print vCells(1, 1)
Debug.Print "Overall " & t1 / 1000, , "Function " & t2 / 1000
Debug.Print
Next
Set rng = Nothing: Erase vCells
End Sub
Function SentCap(vs) As Boolean
Dim b As Boolean
Dim i As Long, k As Long, naCnt As Byte
Dim n As Long, t As Long
Dim nLen As Long
Dim aCorrect() As String
Const cSPACE As String = " "
Const cDOT As String = "."
'A few simple corrections
naCnt = 8 ' count of corrections
ReDim aCorrect(1 To naCnt, 0 To 1) As String
'could maintain this array as Static & erase when
'done, if called as a big loop.
'or pass as an argument for special purposes
aCorrect(1, 0) = " i ": aCorrect(1, 1) = " I "
aCorrect(2, 0) = " i.": aCorrect(2, 1) = " I."
aCorrect(3, 0) = " i,": aCorrect(3, 1) = " I,"
aCorrect(4, 0) = " i'": aCorrect(4, 1) = " I'"
aCorrect(5, 0) = "mr": aCorrect(5, 1) = "Mr"
aCorrect(6, 0) = "mrs": aCorrect(6, 1) = "Mrs"
aCorrect(7, 0) = "oliver": aCorrect(7, 1) = "Oliver"
aCorrect(8, 0) = "peter": aCorrect(8, 1) = "Peter"
' careful of names/acronyms that could be suffix or prefix
' or common words, eg "miss", "nate"
vs = CStr(vs)
nLen = Len(vs)
n = 1: t = 1
ReDim va(0) 'array for each sentance
'split each sentance ending in DOT
' (probably easier to use vba's Split function)
Do While n > 0
n = InStr(n, vs, cDOT)
If n Then
'include any more spaces after the DOT
b = True
Do While (b)
b = Mid$(vs, n + 1, 1) = cSPACE
If b Then
n = n + 1
End If
Loop
If cnt Then ReDim Preserve va(cnt)
va(cnt) = Mid(vs, t, n - t + 1)
cnt = cnt + 1
n = n + 1
t = n
End If
Loop
If t - 1 < nLen Then
'in case last sentance doesn't end with a DOT
If cnt Then ReDim Preserve va(cnt + 1)
va(cnt) = Mid$(vs, t, nLen - t + 1)
End If
For i = 0 To UBound(va)
'process each sentance, starting with initial capital
va(i) = UCase$(Left$(va(i), 1)) & LCase$(Mid$(va(i), 2, Len(va(i))))
'now the corrections
For k = 1 To naCnt
If InStr(1, va(i), aCorrect(k, 0)) Then
#If VBA6 Then
va(i) = Replace(va(i), aCorrect(k, 0), aCorrect(k, 1))
#Else
'for Excel 97
va(i) = Application.Substitute(va(i), " i ", " I ")
#End If
End If
Next
'any more corrections, eg trailing "I" without punctuation
If Right$(va(i), 2) = " i" Then
va(i) = Left$(va(i), Len(va(i)) - 2) & " I"
End If
Next
're-join the sentances
'this loop seems as fast as "vs = Join(va)" & OK for Excel 97
For i = 1 To UBound(va)
va(0) = va(0) & va(i)
Next
vs = va(0)
End Function