Parse date from string

D

DavidB

Situation...
In one column of a workbook there is text that cell by cell may or may
not contain a date. I want to extract the sate (if one exists) and
store it in the next column.

Example...
Values in Column 1:
A1 = "It was on on 07/04/1776 and we won"
A2 = "1/1/2000 wasnt so bad after all."
A3 = "Sorry no date nere!"
A4 = "New name/address on 11/1/2007 (OH NO EXTRA /)"

Desired data in Column 2:
A2 = 07/04/1776
B2 = 1/1/2000
B3 = -NULL-
B4 = 11/1/2007

Any help would be greatly appreciated... Thank you.
 
B

Bernie Deitrick

David,

You could use a UDF - see the code below. Use it like

=GetDate(A1)

The first version will return a date, the second will return a string. Note that dates prior to
1900 will not work as dates, but they will work as strings. Also note that this will not find dates
like "March 11, 2003"

Also, the dates version will return March 8, 2007 from this string, not the correct date (and the
string version will return 3/8, not 4/5/6)

"I used my 3/8 inch socket set on 4/5/6."

HTH,
Bernie
MS Excel MVP

'Date Version
Function GetDate(strInput As String) As Variant
Dim myVals As Variant
Dim i As Integer
Dim myDate As Date

myVals = Split(strInput, " ")

On Error GoTo ErrHandler
For i = LBound(myVals) To UBound(myVals)
myDate = DateValue(myVals(i))
GetDate = DateValue(myVals(i))
Exit Function
TryDate:
Next i

GetDate = "-NULL-"
Exit Function
ErrHandler:
Resume TryDate
End Function

'StringVersion
Function GetDate(strInput As String) As String
Dim myVals As Variant
Dim i As Integer
Dim myDate As Date

myVals = Split(strInput, " ")

On Error GoTo ErrHandler
For i = LBound(myVals) To UBound(myVals)
myDate = DateValue(myVals(i))
GetDate = myVals(i)
Exit Function
TryDate:
Next i

GetDate = "-NULL-"
Exit Function
ErrHandler:
Resume TryDate
End Function
 
H

Harlan Grove

Bernie Deitrick said:
You could use a UDF - see the code below. Use it like ....
. . . note that this will not find dates like "March 11, 2003" ....
'Date Version
Function GetDate(strInput As String) As Variant
Dim myVals As Variant
Dim i As Integer
Dim myDate As Date

myVals = Split(strInput, " ")

Which means any date immediately followed by punctuation, e.g.,

On 7/25/2007, Bernie Dietrick posted a suboptimal udf.

will include the punctuation mark in the entry in myVals.
On Error GoTo ErrHandler

Unnecessary, and On Error Resume Next would have made more sense.
For i = LBound(myVals) To UBound(myVals)
myDate = DateValue(myVals(i))

This throws an error if the string isn't a date. BUT, if it IS a date,
then myDate contains that date . . .
GetDate = DateValue(myVals(i))

.. . . which means this second DateValue call is pointless. Eliminate
myDate and the first of these assignment statements
Exit Function
TryDate:
Next i

GetDate = "-NULL-"

Questionable return value. You could have returned an error value, or
"" or a negative number of large absolute value (so it wouldn't be
confused with a date in 1904 date system).
Exit Function
ErrHandler:
Resume TryDate
End Function
....

Spaghetti code at its, er, finest!

If this approach made sense (it doesn't), then at least use sensible
control flow.


Function foo(s As String) As Variant
Dim t As Variant

foo = "" 'return value for no date found

On Error Resume Next

For Each t In Split(s, " ")
foo = DateValue(t)
If Err.Number <> 0 Then Err.Clear Else Exit For
Next t

End Function


But even better would be using a more intelligent approach.


Function foo2(s As String) As Variant
'requires a VBA reference to
'Microsoft VBScript Regular Expressions 5.5

Const MONM As String = _
"Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|" & _
"January|February|March|April|May|June|" & _
"July|August|September|October|November|December"

Dim re As New RegExp, mc As MatchCollection

With re
.Global = True
.IgnoreCase = True

.Pattern = "\b(\d{1,2}[-/]\d{1,2}[-/](\d{2}){1,2})|" & _
"(\d{4}-\d{2}-\d{2})|((" & MONM & ")\d{1,2}(,?\s*\d{4})?)|" & _
"(\d{1,2}[- ]*(" & MONM & ")([- ]*\d{2,4})?)\b"

Set mc = .Execute(s)
End With

foo2 = IIf(mc.Count > 0, CDate(mc.Item(0).Value), "")

End Function


This should find any substring Excel itself would consider a date.
 
R

Ron Rosenfeld

Bernie Deitrick said:
You could use a UDF - see the code below. Use it like ...
. . . note that this will not find dates like "March 11, 2003" ...
'Date Version
Function GetDate(strInput As String) As Variant
Dim myVals As Variant
Dim i As Integer
Dim myDate As Date

myVals = Split(strInput, " ")

Which means any date immediately followed by punctuation, e.g.,

On 7/25/2007, Bernie Dietrick posted a suboptimal udf.

will include the punctuation mark in the entry in myVals.
On Error GoTo ErrHandler

Unnecessary, and On Error Resume Next would have made more sense.
For i = LBound(myVals) To UBound(myVals)
myDate = DateValue(myVals(i))

This throws an error if the string isn't a date. BUT, if it IS a date,
then myDate contains that date . . .
GetDate = DateValue(myVals(i))

. . . which means this second DateValue call is pointless. Eliminate
myDate and the first of these assignment statements
Exit Function
TryDate:
Next i

GetDate = "-NULL-"

Questionable return value. You could have returned an error value, or
"" or a negative number of large absolute value (so it wouldn't be
confused with a date in 1904 date system).
Exit Function
ErrHandler:
Resume TryDate
End Function
...

Spaghetti code at its, er, finest!

If this approach made sense (it doesn't), then at least use sensible
control flow.


Function foo(s As String) As Variant
Dim t As Variant

foo = "" 'return value for no date found

On Error Resume Next

For Each t In Split(s, " ")
foo = DateValue(t)
If Err.Number <> 0 Then Err.Clear Else Exit For
Next t

End Function


But even better would be using a more intelligent approach.


Function foo2(s As String) As Variant
'requires a VBA reference to
'Microsoft VBScript Regular Expressions 5.5

Const MONM As String = _
"Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|" & _
"January|February|March|April|May|June|" & _
"July|August|September|October|November|December"

Dim re As New RegExp, mc As MatchCollection

With re
.Global = True
.IgnoreCase = True

.Pattern = "\b(\d{1,2}[-/]\d{1,2}[-/](\d{2}){1,2})|" & _
"(\d{4}-\d{2}-\d{2})|((" & MONM & ")\d{1,2}(,?\s*\d{4})?)|" & _
"(\d{1,2}[- ]*(" & MONM & ")([- ]*\d{2,4})?)\b"

Set mc = .Execute(s)
End With

foo2 = IIf(mc.Count > 0, CDate(mc.Item(0).Value), "")

End Function


This should find any substring Excel itself would consider a date.

However, Harlan's code has a problem. It will return a #VALUE! error when
mc.Count = 0 (i.e. when no date is present). This is likely because, as
documented in HELP, IIf evaluates both the truepart and the falsepart.

mc.Item(0) is invalid when mc.count = 0

A better approach might be to use the Test method? of re, rather than the Count
method of mc.

Or to use the If function rather than IIf

e.g.

foo2 = ""
If mc.Count > 0 Then foo2 = CDate(mc.Item(0).Value)


--ron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top