Removing various combinations of text and returning only a number inexcel

W

Waheed Ajouhaar

A very good day to all,
I am looking for a formula that would remove all non-numeric data from cells and return only a number. PS- notice that there is one numeric character that also needs to be removed - the number after the "m". also note that the formula must be able to handle each variation of text below.

Data:

A1 = "Gaz = 3 421 786 m3"
A2 = "Gaz=3 421 786 m4"
A3 = "Gaz=3421786m5"
A4 = "Gaz = 3,421,786 m6"
A5 = "Gaz=3,421,786m7"
A6 = "Gaz =3 421 786 m3"

Result:
The resultant for each line should be "3421786"

Looking forward to see the result.
Thanks :)


Submitted via EggHeadCafe - Software Developer Portal of Choice
Dynamic ASP.NET charts with Office Web Components
http://www.eggheadcafe.com/tutorial...f5-b3de724b556f/dynamic-aspnet-charts-wi.aspx
 
R

Ron Rosenfeld

A very good day to all,
I am looking for a formula that would remove all non-numeric data from cells and return only a number. PS- notice that there is one numeric character that also needs to be removed - the number after the "m". also note that the formula must be able to handle each variation of text below.

Data:

A1 = "Gaz = 3 421 786 m3"
A2 = "Gaz=3 421 786 m4"
A3 = "Gaz=3421786m5"
A4 = "Gaz = 3,421,786 m6"
A5 = "Gaz=3,421,786m7"
A6 = "Gaz =3 421 786 m3"

Result:
The resultant for each line should be "3421786"

Looking forward to see the result.
Thanks :)


Submitted via EggHeadCafe - Software Developer Portal of Choice
Dynamic ASP.NET charts with Office Web Components
http://www.eggheadcafe.com/tutorial...f5-b3de724b556f/dynamic-aspnet-charts-wi.aspx


This can be done simply with a UDF.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=RemSpec(A1)

in some cell.

=========================================
Option Explicit
Function RemSpec(s As String) As String
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = "m\d|\D"
RemSpec = re.Replace(s, "")
End Function
==========================================


--ron
 
R

Ron Rosenfeld

A very good day to all,
I am looking for a formula that would remove all non-numeric data from cells and return only a number. PS- notice that there is one numeric character that also needs to be removed - the number after the "m". also note that the formula must be able to handle each variation of text below.

Data:

A1 = "Gaz = 3 421 786 m3"
A2 = "Gaz=3 421 786 m4"
A3 = "Gaz=3421786m5"
A4 = "Gaz = 3,421,786 m6"
A5 = "Gaz=3,421,786m7"
A6 = "Gaz =3 421 786 m3"

Result:
The resultant for each line should be "3421786"

Looking forward to see the result.
Thanks :)


Submitted via EggHeadCafe - Software Developer Portal of Choice
Dynamic ASP.NET charts with Office Web Components
http://www.eggheadcafe.com/tutorial...f5-b3de724b556f/dynamic-aspnet-charts-wi.aspx

If your strings are exactly like the above, where the number starts with the
first non-space character after the "=", and the first non-space character
after the number will always be an "m" (and that is the only "m" in the
string), then you could use this formula:

=--MID(SUBSTITUTE(LEFT(A1,SEARCH("m",A1)-1)," ",""),
FIND("=",SUBSTITUTE(LEFT(A1,SEARCH("m?",A1)-1)," ",""))+1,99)
--ron
 
W

Waheed Ajouhaar

Hi Ron,
The UDF works great thanks - wow, very little code to accomplish a task that seemed to be very tricky given the various data formats. Thank you very much - if not too much to ask, could you please elaborate on how the code accomplishes the task at hand? Is the --- re.Pattern = "m\d|\D" --- section the bit that actually does the clean up?
I am still learning to write code, don?t have much experience but very eager to learn more.
Many Thanks, your efforts are much appreciated!




Ron Rosenfeld wrote:

This can be done simply with a UDF.
15-Dec-09

This can be done simply with a UDF

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basi
Editor
Ensure your project is highlighted in the Project Explorer window
Then, from the top menu, select Insert/Module an
paste the code below into the window that opens

To use this User Defined Function (UDF), enter a formula lik

=RemSpec(A1

in some cell

========================================
Option Explici
Function RemSpec(s As String) As Strin
Dim re As Objec
Set re = CreateObject("vbscript.regexp"
re.Global = Tru
re.Pattern = "m\d|\D
RemSpec = re.Replace(s, ""
End Functio
=========================================

--ron

Previous Posts In This Thread:


Submitted via EggHeadCafe - Software Developer Portal of Choice
Web Services - SOAP / Multiple Recordsets
http://www.eggheadcafe.com/tutorial...e16-e93ab938dc36/web-services---soap--mu.aspx
 
W

Waheed Ajouhaar

Hi Ron,Your formuala also seems much simpler than mine.
I came up with the following formula before I saw your solution:

=(REPLACE(REPLACE(SUBSTITUTE(SUBSTITUTE(A1," ",""),",",""),SEARCH("Gaz=",SUBSTITUTE(SUBSTITUTE(A1," ",""),",","")),4,""),SEARCH("m3",REPLACE(SUBSTITUTE(SUBSTITUTE(A1," ",""),",",""),SEARCH("Gaz=",SUBSTITUTE(SUBSTITUTE(A1," ",""),",","")),4,"")),2,""))*1

Many thanks!
Waheed



Ron Rosenfeld wrote:

If your strings are exactly like the above, where the number starts with
15-Dec-09

If your strings are exactly like the above, where the number starts with th
first non-space character after the "=", and the first non-space characte
after the number will always be an "m" (and that is the only "m" in th
string), then you could use this formula

=--MID(SUBSTITUTE(LEFT(A1,SEARCH("m",A1)-1)," ","")
FIND("=",SUBSTITUTE(LEFT(A1,SEARCH("m?",A1)-1)," ",""))+1,99
--ron

Previous Posts In This Thread:


Submitted via EggHeadCafe - Software Developer Portal of Choice
XML/XPath Query On Attributes
http://www.eggheadcafe.com/tutorial...c6-580045c565bf/xmlxpath-query-on-attrib.aspx
 
R

Ron Rosenfeld

Hi Ron,
The UDF works great thanks - wow, very little code to accomplish a task that seemed to be very tricky given the various data formats. Thank you very much - if not too much to ask, could you please elaborate on how the code accomplishes the task at hand? Is the --- re.Pattern = "m\d|\D" --- section the bit that actually does the clean up?
I am still learning to write code, don?t have much experience but very eager to learn more.
Many Thanks, your efforts are much appreciated!


Waheed,

First of all, when you respond to a posting, see if you can figure out how to
keep your responses in the same thread (depends on the interface you are using
to access the newsgroup). I almost didn't see this because it has a different
Subject and it is not threaded with the original post.

So far as the UDF, it makes use of Regular Expressions, which is a powerful
tool for working with strings. It could have been done with "native" VBA
functions, but this was much quicker to develop.

The actual expression formal explanation:

m\d|\D

Options: case insensitive; ^ and $ match at line breaks

Match either the regular expression below (attempting the next alternative only
if this one fails) «m\d»
Match the character “m” literally «m»
Match a single digit 0..9 «\d»
Or match regular expression number 2 below (the entire match attempt fails if
this one fails to match) «\D»
Match a single character that is not a digit 0..9 «\D»

So first we look for an m followed by a single digit. Note that this will find
that pattern (m<digit>) anyplace in the string, so if it doesn't occur only at
the end, you could get unwanted results.

Next we look for any NON-digit.

Once we have collected those matches, we Replace them with nothing ("").

If you Google for Regular Expressions you will see a wealth of information on
how to set up the expression.

If you Google for Regular Expressions in VBA; and also look for that in the
MSDN knowledge base, you will get information as to their implementation in
VBA.
--ron
 
R

Ron Rosenfeld

Hi Ron,Your formuala also seems much simpler than mine.
I came up with the following formula before I saw your solution:

=(REPLACE(REPLACE(SUBSTITUTE(SUBSTITUTE(A1," ",""),",",""),SEARCH("Gaz=",SUBSTITUTE(SUBSTITUTE(A1," ",""),",","")),4,""),SEARCH("m3",REPLACE(SUBSTITUTE(SUBSTITUTE(A1," ",""),",",""),SEARCH("Gaz=",SUBSTITUTE(SUBSTITUTE(A1," ",""),",","")),4,"")),2,""))*1

Many thanks!
Waheed

That's because of the assumptions I made (see below). So I only needed to
Ron Rosenfeld wrote:

If your strings are exactly like the above, where the number starts with
15-Dec-09

If your strings are exactly like the above, where the number starts with the
first non-space character after the "=", and the first non-space character
after the number will always be an "m" (and that is the only "m" in the
string), then you could use this formula:

=--MID(SUBSTITUTE(LEFT(A1,SEARCH("m",A1)-1)," ",""),
FIND("=",SUBSTITUTE(LEFT(A1,SEARCH("m?",A1)-1)," ",""))+1,99)
--ron

Previous Posts In This Thread:


Submitted via EggHeadCafe - Software Developer Portal of Choice
XML/XPath Query On Attributes
http://www.eggheadcafe.com/tutorial...c6-580045c565bf/xmlxpath-query-on-attrib.aspx
--ron
 
Top