finding numbers within text

J

jasondebolt

I'm trying to extract a 15 digit number from a long text string within
cells.

Many of the cells in my spreadsheet have lots of text, with sometimes
200-300 words per cell. Each of these cells has one15 digit number
that I need to extract and separate from the text. This 15 digit
number will then be pasted into a different cell.

Normally, I would do a search for a 6 digit number that begins with 1
within a cell like this:

"This is my text 123456 hello everyone" --> text in cell A1

=MID(A1,FIND("1",A1,1),6) --> formula in cell A2

123456 --> result in cell A2


However, I'm looking for a 15 digit number that doesn't start or end
with the same numbers; it's a completely random number. There is
nothing before or after the numbers to reference, such as "order #" or
"# of days."I don't think there are any wildcard characters in excel
exclusivley for numbers. Anyone know how to create a formula for this?

Thanks,
Jason
 
G

Gord Dibben

Jason

You could use this Function on a copy of the cell(s) with numbers/text.

Function DeleteNonNumerics(ByVal sStr As String) As Long
Dim i As Long
If sStr Like "*[0-9]*" Then
For i = 1 To Len(sStr)
If Mid(sStr, i, 1) Like "[0-9]" Then
DeleteNonNumerics = DeleteNonNumerics & Mid(sStr, i, 1)
End If
Next i
Else
DeleteNonNumerics = sStr
End If
End Function

Usage is: =DeleteNonNumerics(cellref)


Gord Dibben MS Excel MVP
 
J

jasondebolt

Thanks for the quick reply, you really are an Excel MVP :)

However, it's still not working. The function works for some numbers,
but not for the specific 15 digit long numbers that I'm looking for.
Do you know why the function recognizes some numbers within text
strings, but not all of them? Also, is there a way to limit the
numbers that the function spits out to be exactly 15 digits long? The
output number has to be exactly 15 digits long, no more or less.

Thanks!
Jason




Jason

You could use this Function on a copy of the cell(s) with numbers/text.

Function DeleteNonNumerics(ByVal sStr As String) As Long
Dim i As Long
If sStr Like "*[0-9]*" Then
For i = 1 To Len(sStr)
If Mid(sStr, i, 1) Like "[0-9]" Then
DeleteNonNumerics = DeleteNonNumerics & Mid(sStr, i, 1)
End If
Next i
Else
DeleteNonNumerics = sStr
End If
End Function

Usage is: =DeleteNonNumerics(cellref)

Gord Dibben MS Excel MVP

I'm trying to extract a 15 digit number from a long text string within
cells.
Many of the cells in my spreadsheet have lots of text, with sometimes
200-300 words per cell. Each of these cells has one15 digit number
that I need to extract and separate from the text. This 15 digit
number will then be pasted into a different cell.
Normally, I would do a search for a 6 digit number that begins with 1
within a cell like this:


"This is my text 123456 hello everyone" --> text in cell A1
=MID(A1,FIND("1",A1,1),6) --> formula in cell A2
123456 --> result in cell A2
However, I'm looking for a 15 digit number that doesn't start or end
with the same numbers; it's a completely random number. There is
nothing before or after the numbers to reference, such as "order #" or
"# of days."I don't think there are any wildcard characters in excel
exclusivley for numbers. Anyone know how to create a formula for this?
Thanks,
Jason
 
R

Ron Rosenfeld

I'm trying to extract a 15 digit number from a long text string within
cells.

Many of the cells in my spreadsheet have lots of text, with sometimes
200-300 words per cell. Each of these cells has one15 digit number
that I need to extract and separate from the text. This 15 digit
number will then be pasted into a different cell.

Normally, I would do a search for a 6 digit number that begins with 1
within a cell like this:

"This is my text 123456 hello everyone" --> text in cell A1

=MID(A1,FIND("1",A1,1),6) --> formula in cell A2

123456 --> result in cell A2


However, I'm looking for a 15 digit number that doesn't start or end
with the same numbers; it's a completely random number. There is
nothing before or after the numbers to reference, such as "order #" or
"# of days."I don't think there are any wildcard characters in excel
exclusivley for numbers. Anyone know how to create a formula for this?

Thanks,
Jason

You could use this UDF which makes use of Regular Expressions.

<alt-F11> opens the VB Editor.

Ensure your project is highlighted in the Project Explorer window, then
Insert/Module and paste the code below into the window that opens.

You must also set a reference to Microsoft VBScript Regular Expressions 5.5
(Under Tools/References).

=====================================================
Option Explicit

Function Get15(str As String) As String
Dim objRegExp As RegExp
Dim objMatch As Match
Dim colMatches As MatchCollection
Const Pattern As String = "(^|\D)(\d{15})(\D|$)"

' Create a regular expression object.
Set objRegExp = New RegExp

'Set the pattern by using the Pattern property.
objRegExp.Pattern = Pattern

'Set global applicability.
objRegExp.Global = True

'Test whether the String can be compared.
If (objRegExp.Test(str) = True) Then
Set colMatches = objRegExp.Execute(str)
Set objMatch = colMatches.Item(0)
Get15 = objMatch.SubMatches(1)
End If
End Function
====================================================

"Pattern" sets the specifications. The Pattern that needs to be matched
consists of:

1. Either a non-Digit character or the start of the string
followed by
2. any 15 consecutive digits
followed by
3. a non-Digit character or the end of the string

To use this, enter the UDF =Get15(str) where str is either a string or a cell
reference containing the string.
--ron
 
T

T. Valko

Do any numbers have leading 0's:

012345678901234

This formula will extract the number string but will drop any leading 0's:

=LOOKUP(99^99,--MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),ROW(INDIRECT("1:"&LEN(A1)))))

Excel insists on formatting the result as SCIENTIFIC so you'll want to
format as NUMBER 0 decimal places. Or, if you do have leading 0's you can
use a custom format of: 000000000000000

Biff
 
M

Miyahn

You must also set a reference to Microsoft VBScript Regular Expressions 5.5

Here is a late binding version.
(This can use without reference setting.)

Option Explicit
Function GetNumber(Src As String, Optional Digit As Long = 0) As String
Application.Volatile
With CreateObject("VBScript.RegExp")
Select Case Digit
Case 0: .Pattern = "\d+"
Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)"
End Select
If .Test(Src) Then
GetNumber = .Execute(Src)(0).Submatches(0)
End If
End With
End Function
 
B

Bob Phillips

If you are happy with the result being a string, you can use

=MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(A1,{0,1,2,3,4,5,6,7,8,9},""))))

--
---
HTH

Bob

(there's no email, no snail mail, but somewhere should be gmail in my addy)



Thanks for the quick reply, you really are an Excel MVP :)

However, it's still not working. The function works for some numbers,
but not for the specific 15 digit long numbers that I'm looking for.
Do you know why the function recognizes some numbers within text
strings, but not all of them? Also, is there a way to limit the
numbers that the function spits out to be exactly 15 digits long? The
output number has to be exactly 15 digits long, no more or less.

Thanks!
Jason




Jason

You could use this Function on a copy of the cell(s) with numbers/text.

Function DeleteNonNumerics(ByVal sStr As String) As Long
Dim i As Long
If sStr Like "*[0-9]*" Then
For i = 1 To Len(sStr)
If Mid(sStr, i, 1) Like "[0-9]" Then
DeleteNonNumerics = DeleteNonNumerics & Mid(sStr, i, 1)
End If
Next i
Else
DeleteNonNumerics = sStr
End If
End Function

Usage is: =DeleteNonNumerics(cellref)

Gord Dibben MS Excel MVP

I'm trying to extract a 15 digit number from a long text string within
cells.
Many of the cells in my spreadsheet have lots of text, with sometimes
200-300 words per cell. Each of these cells has one15 digit number
that I need to extract and separate from the text. This 15 digit
number will then be pasted into a different cell.
Normally, I would do a search for a 6 digit number that begins with 1
within a cell like this:


"This is my text 123456 hello everyone" --> text in cell A1
=MID(A1,FIND("1",A1,1),6) --> formula in cell A2
123456 --> result in cell A2
However, I'm looking for a 15 digit number that doesn't start or end
with the same numbers; it's a completely random number. There is
nothing before or after the numbers to reference, such as "order #" or
"# of days."I don't think there are any wildcard characters in excel
exclusivley for numbers. Anyone know how to create a formula for this?
Thanks,
Jason
 
R

Ron Rosenfeld

in message news:[email protected]

Here is a late binding version.
(This can use without reference setting.)

Option Explicit
Function GetNumber(Src As String, Optional Digit As Long = 0) As String
Application.Volatile
With CreateObject("VBScript.RegExp")
Select Case Digit
Case 0: .Pattern = "\d+"
Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)"
End Select
If .Test(Src) Then
GetNumber = .Execute(Src)(0).Submatches(0)
End If
End With
End Function

If there are many fields to process, won't late binding make for a longer
execution time?
--ron
 
R

Ron Rosenfeld

If you are happy with the result being a string, you can use

=MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(A1,{0,1,2,3,4,5,6,7,8,9},""))))

I don't know if it makes a difference to the OP, but your formula does not
differentiate between numbers that contain 15 digits, and those that contain
fewer or more.
--ron
 
B

Bob Phillips

Hi Ron,

I wasn't sure from his post whether he only wanted 15 digit numbers amongst
others, or they would all be 15 digits. From his post I took it that he
wanted all numbers, he just couldn't use his formula because they didn't all
start with 1.

--
---
HTH

Bob

(there's no email, no snail mail, but somewhere should be gmail in my addy)
 
B

Bernd

Hi Jason,

A suggestion:
Function XtractNum(s As String) As Double
Dim i As Long, d As Double, c As String, f As Double

i = 0: d = 0#: f = 1#

StateStart:
i = i + 1
If i > Len(s) Then GoTo StateEnd
c = Mid(s, i, 1)
Select Case c
Case "0" To "9"
d = CDbl(c)
GoTo StatePreComma
Case "."
d = 0#
GoTo StatePostComma
Case Else
GoTo StateStart
End Select

StatePreComma:
i = i + 1
If i > Len(s) Then GoTo StateEnd
c = Mid(s, i, 1)
Select Case c
Case "0" To "9"
d = 10# * d + CDbl(c)
GoTo StatePreComma
Case "."
GoTo StatePostComma
Case Else
GoTo StateEnd
End Select

StatePostComma:
i = i + 1
If i > Len(s) Then GoTo StateEnd
c = Mid(s, i, 1)
Select Case c
Case "0" To "9"
f = f / 10#
d = d + CDbl(c) * f
GoTo StatePostComma
Case Else
GoTo StateEnd
End Select

StateEnd:
XtractNum = d

'Small specialty: Accept only 15 digit numbers:
If Len(CStr(d)) <> 15 Then
XtractNum = CVErr(xlErrNum)
End If

End Function

Please notice that leading zeros will be swallowed and that the
decimal point will be counted as a digit, too. Change the "small
specialty" at the end of my UDF if necessary.

Regards,
Bernd
 
R

Ron Rosenfeld

Hi Ron,

I wasn't sure from his post whether he only wanted 15 digit numbers amongst
others, or they would all be 15 digits. From his post I took it that he
wanted all numbers, he just couldn't use his formula because they didn't all
start with 1.

Yes, it was a bit hard to tell.

I assumed that he just wanted a 15 digit number, and that there would be only
one 15 digit number in the string. So my regular expression solution only
returns numbers that are exactly 15 digits in length. And by definition, there
would have to be a "non-number" before and after.


--ron
 
J

jasondebolt

in messagenews:[email protected]

Here is a late binding version.
(This can use without reference setting.)

Option Explicit
Function GetNumber(Src As String, Optional Digit As Long = 0) As String
Application.Volatile
With CreateObject("VBScript.RegExp")
Select Case Digit
Case 0: .Pattern = "\d+"
Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)"
End Select
If .Test(Src) Then
GetNumber = .Execute(Src)(0).Submatches(0)
End If
End With
End Function


It works! Thank you :)
 
J

jasondebolt

in messagenews:[email protected]

Here is a late binding version.
(This can use without reference setting.)

Option Explicit
Function GetNumber(Src As String, Optional Digit As Long = 0) As String
Application.Volatile
With CreateObject("VBScript.RegExp")
Select Case Digit
Case 0: .Pattern = "\d+"
Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)"
End Select
If .Test(Src) Then
GetNumber = .Execute(Src)(0).Submatches(0)
End If
End With
End Function


Thanks!!
 
M

Miyahn

If there are many fields to process, won't late binding make for a longer
execution time?

Yes.
In my environments, the late binding version takes a little longer time
(+1.2%) to execute.

Option Explicit
Function GetNumberLB(Src As String, Optional Digit As Long = 0) As String
Application.Volatile
With CreateObject("VBScript.RegExp")
Select Case Digit
Case 0: .Pattern = "\d+"
Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)"
End Select
If .Test(Src) Then
GetNumberLB = .Execute(Src)(0).Submatches(0)
End If
End With
End Function
'
Function GetNumberEB(Src As String, Optional Digit As Long = 0) As String
Application.Volatile
With New RegExp
Select Case Digit
Case 0: .Pattern = "\d+"
Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)"
End Select
If .Test(Src) Then
GetNumberEB = .Execute(Src)(0).Submatches(0)
End If
End With
End Function
'

Sub PerformanceTest()
Const DummySrc = "ABCD123456EFG"
Dim S As Single, I As Long, DummyResult As String
S = Timer
For I = 1 To 10000
DummyResult = GetNumberLB(DummySrc, 6)
Next I
Debug.Print "Late Binding: " & Timer - S
S = Timer
For I = 1 To 10000
DummyResult = GetNumberEB(DummySrc, 6)
Next I
Debug.Print "Early Binding: " & Timer - S
End Sub
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top