Parse String

  • Thread starter Thread starter Geoff Murley
  • Start date Start date
G

Geoff Murley

I have a column containing Strings such as:

"10 new street, Wales f450 myhouse abc"

in each row.

I would like some code to extract the number after the 'f'
character so that in this example '450' is written to the
cell in the adjacent column. So if the above string was
in cell B1 , '450' would be written to C1 and so on for
all the entries in column B. Assume that there will only
be one 'f' entry in each string.

Can anyone help please?
 
As long as there is a comma preceding bit, just one, and a space before the
f, then

=MID(B1,FIND(" f",B1,FIND(",",B1))+1,99)

--

HTH

RP
(remove nothere from the email address if mailing direct)
 
Thanks Bob,

However it doesn't quite work. With my example I get
f450 my house abc

all I want is the '450' extracted.

Also there may not be a space or a comma before the 'f'.
So I could have myhouse(f450) fgh. I would only want the
450 in the next cell. The only thing I can assume is that
the letter 'f' will be followed by a number. So would a
bit of VBA be needed to check for a 'f' immediately
followed by a number?
 
Sorry, forgot to finish it. Another shot

=MID(B1,FIND(" f",B1,FIND(",",B1))+2,FIND(" ",B1,FIND("
f",B1,FIND(",",B1))+1)-FIND(" f",B1,FIND(",",B1))-2)

--

HTH

RP
(remove nothere from the email address if mailing direct)
 
This formula returns 450 given the text you specified.

=MID(A1,FIND("f",A1)+1,FIND(" ",A1,FIND("f",A1))-1-FIND("f",A1))

The following VBA function will handle cases where there is more than one "f"
in the text. It finds the first f that is followed by a digit. Then it takes
all of the characters after that f that can be interpreted as a number, i.e.
it accepts a decimal point, comma, and minus sign as well as digits 0-9.

With an input of "myrna larson fabc def adc f1,280.48 zwz", the result is
1280.48


Option Explicit

Function GetNumber(sText As String) As Variant
Dim f As Long
Dim n As Long
Dim s As Long

GetNumber = CVErr(xlErrValue)
f = 0
Do
f = InStr(f + 1, sText, "f")
If f = 0 Then Exit Do

If Mid$(sText, f, 2) Like "f#" Then
f = f + 1
For s = f + 1 To Len(sText)
Select Case Asc(Mid$(sText, s, 1))
Case 44 To 46, 48 To 57 ' , - . 0-9

Case Else
Exit For
End Select
Next s
GetNumber = CDbl(Mid$(sText, f, s - f))
Exit Do
End If
Loop

End Function
 
Myrna Larson wrote...
This formula returns 450 given the text you specified.

=MID(A1,FIND("f",A1)+1,FIND(" ",A1,FIND("f",A1))-1-FIND("f",A1))

Chokes if 'f#..#' appears at the end of A1.

The general case in which there could be any number of 'f' substrings
before the leftmost 'f' immediately followed by a numeral and the
'f#..#' substring could be delimited by any other characters as well as
appearing at the end of the string would be the array formula

=MID(A1,MIN(FIND("f"&{0;1;2;3;4;5;6;7;8;9},A1&"f0f1f2f3f4f5f6f7f8f9f0"))+1,
MATCH(0,-ISNUMBER(-MID(A1&"
",ROW(INDIRECT((MIN(FIND("f"&{0;1;2;3;4;5;6;7;8;9},
A1&"f0f1f2f3f4f5f6f7f8f9f0"))+2)&":"&(LEN(A1)+1))),1)),0))

The following VBA function will handle cases where there is more than one "f"
in the text. It finds the first f that is followed by a digit. Then it takes
all of the characters after that f that can be interpreted as a number, i.e.
it accepts a decimal point, comma, and minus sign as well as digits
0-9.

Then it'll return '.-.' for "123 foobar lane -.f.-.f.- !! f10"
With an input of "myrna larson fabc def adc f1,280.48 zwz", the result is
1280.48


Option Explicit

Function GetNumber(sText As String) As Variant
Dim f As Long
Dim n As Long
Dim s As Long

GetNumber = CVErr(xlErrValue)
f = 0
Do
f = InStr(f + 1, sText, "f")
If f = 0 Then Exit Do

If Mid$(sText, f, 2) Like "f#" Then
f = f + 1
For s = f + 1 To Len(sText)
Select Case Asc(Mid$(sText, s, 1))
Case 44 To 46, 48 To 57 ' , - . 0-9

Case Else
Exit For
End Select
Next s
GetNumber = CDbl(Mid$(sText, f, s - f))
Exit Do
End If
Loop

End Function
....

If you're going to resort to VBA, why not make it general? For string
parsing nothing beats regular expressions. Using the Subst function in

http://groups-beta.google.com/group/microsoft.public.excel.worksheet.functions/msg/d2d252b4201d9d22

(or http://makeashorterlink.com/?L2BA2136A ). Then use the worksheet
formula

=subst(A1,".*?f(-?\d*\.?\d+).*","$1")

which will only pull true numeric substrings following 'f' rather than
any stray sequences of hyphens and periods that happen to follow an
'f'. For instance,

=subst("123 foo foo-foo f--2--f f-.547abcdef
f0",".*?f(-?\d*\.?\d+).*","$1")

returns -.547. Try, just try, to handle this kind of string with
reasonable numeric substring semantics without regular expressions. Of
course it can be done, but it requires a state machine.
 
True. Given his example (an address?), I inferred it would be followed by a
space and would not occur at the end of the string.

If it is normally followed by a space, this modification, that ensures there
is a space at the end of the text, would handle that variation.

=MID(A1,FIND("f",A1)+1,FIND(" ",A1&" ",FIND("f",A1))-1-FIND("f",A1))

But it assumes his statement that there is only 1 "f" is correct.
 
I just re-read his original message, where he gives the example (f450). The
formula won't handle that -- it requires a space at the end of the number.

I think the macro handles all possibilities, however.
 
Myrna Larson wrote...
I just re-read his original message, where he gives the example (f450). The
formula won't handle that -- it requires a space at the end of the
number.

*Your* formula. Mine handles anything matching the regexp 'f\d+'.

The point is that numeric substrings are well defined - they end with
the rightmost numeral. Whatever follows, if anything, is irrelevant.
Only the transition matters. Therefore, it's sufficient to find all
numerals following the 'f' rather than finding anything in particular
after the rightmost numeral.
I think the macro handles all possibilities, however.

Your udf does check for 'f' followed by a numeral, so I was wrong about
it picking up substrings beginning with 'f' followed by hyphens, commas
or periods, but it does include trailing punctuation, so it could pick
up trailing hyphens, commans and periods that would prevent the
substring from being converted to a number. For example, in both

123 My Street, Wherever f999.. whatever

123 My Street, Wherever .f9.9. whatever

your udf chokes on the CDbl call. If you want to avoid regular
expressions, you could build the state machine into your subsequent
character test.


Function gnaf(s As String) As Variant
Dim p As Long, q As Long, vc As String, c As String * 1

gnaf = CVErr(xlErrValue)
vc = ".0123456789,-"

Do
p = InStr(p + 1, s, "f")
If p = 0 Then Exit Do

If Mid(s, p, 2) Like "f#" Then
p = p + 1
q = p + 1

Do While q <= Len(s)
c = Mid(s, q, 1)

If InStr(1, vc, c) = 0 Then Exit Do 'inner Do

q = q + 1

If c = "-" Then
Exit Do 'inner Do

ElseIf c = "." Then
vc = Mid(vc, 2)

End If

Loop

gnaf = CDbl(Mid(s, p, q - p))
Exit Function

End If

Loop

End Function
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top