remove numbers

R

Ranjit kurian

I need a excel function to remove the numbers, decimals, coma, semicolon and
words like "LTD", "INC" from the cells.

The name in the cells are not fixed, it keep changes

example :
my cell ("A1") contain the below details
Bank of America LTD Inc, US1234567-001.

the excel function should give the below answer
Bank of America
 
M

Max

One lateral idea to approach this, which might work fine here

Assume you have a checklist of "standard" bank names in D1:D10, eg:
Bank of America
Bank of India
St George Bank
etc

Assume your embedded data viz.:
Bank of America LTD Inc, US1234567-001
is running in A1 down

Place in B1, normal ENTER:
=INDEX(D$1:D$10,MATCH(TRUE,INDEX(ISNUMBER(SEARCH($D$1:$D$10,A1)),),0))
Copy down. Col B will return the required "standard" bank names
corresponding to the names embedded in col A.
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:20,500 Files:365 Subscribers:65
xdemechanik
---
 
R

Ron Rosenfeld

I need a excel function to remove the numbers, decimals, coma, semicolon and
words like "LTD", "INC" from the cells.

The name in the cells are not fixed, it keep changes

example :
my cell ("A1") contain the below details
Bank of America LTD Inc, US1234567-001.

the excel function should give the below answer
Bank of America

You can use a UDF (user defined function) but you need a more explicit rule.

But Excel does not have feature to determine "words like 'LTD', 'INC'. YOU
need to develop a list of words that you wish to test for.

In addition, you will need to develop an algorithm that can deal with company
names that include numbers.

This requires, at least in part, a better understanding of the variability in
your data.

For example;

1. Will there always be a final comma, after which everything can be removed?

2. Can numbers (and punctuation) prior to the comma be assumed to be part of
the company name?

--ron
 
G

Gary''s Student

Select the cells you want to process and run this simple macro:

Sub clean_it()
' gsnuxx
Dim sStr As String, i As Long, sStr1 As String
Dim sChar As String, rng As Range
For Each rng In Selection
sStr = rng.Value
For i = 1 To Len(sStr)
sChar = Mid(sStr, i, 1)
If sChar Like "[0-9]" Then
Else
If sChar = "." Or sChar = "," Or sChar = ";" Then
Else
sStr1 = sStr1 & sChar
End If
End If
Next
sStr1 = Replace(sStr1, "INC", "")
sStr1 = Replace(sStr1, "Inc", "")
sStr1 = Replace(sStr1, "LTD", "")
sStr1 = Replace(sStr1, "Ltd", "")
rng.Value = sStr1
Next
End Sub
 
R

Rick Rothstein

sStr1 = Replace(sStr1, "INC", "")
sStr1 = Replace(sStr1, "Inc", "")

The Replace function can be made case insensitive by using its optional
arguments. The functionality of the above two lines of code can be handled
by this single line of code...

sStr1 = Replace(sStr1, "Inc", "", , , vbTextCompare)

or, if you favor named arguments...

sStr1 = Replace(sStr1, "Inc", "", Compare:=vbTextCompare)
 
R

Ranjit kurian

1. Will there always be a final comma, after which everything can be removed?
2. Can numbers (and punctuation) prior to the comma be assumed to be part of
the company name?

answer: no, nothing is constant here, keep changes.

And if you think its not possibe to write any formula when the things are
not constant(fixed), then i will use your formula where i will have a 'final
comma' and punctuation are found as you said below.
 
R

Ranjit kurian

The below code works fine, but is it possible to remove the two words which
always begins with number

example:
the number always begins with some words like (US, CA, USB, USBW), so is it
possible to remove words which begins with word

Bank of America LTD Inc, US1234567-001.
answer expected to be like below
Bank of America


Gary''s Student said:
Select the cells you want to process and run this simple macro:

Sub clean_it()
' gsnuxx
Dim sStr As String, i As Long, sStr1 As String
Dim sChar As String, rng As Range
For Each rng In Selection
sStr = rng.Value
For i = 1 To Len(sStr)
sChar = Mid(sStr, i, 1)
If sChar Like "[0-9]" Then
Else
If sChar = "." Or sChar = "," Or sChar = ";" Then
Else
sStr1 = sStr1 & sChar
End If
End If
Next
sStr1 = Replace(sStr1, "INC", "")
sStr1 = Replace(sStr1, "Inc", "")
sStr1 = Replace(sStr1, "LTD", "")
sStr1 = Replace(sStr1, "Ltd", "")
rng.Value = sStr1
Next
End Sub
--
Gary''s Student - gsnu200816


Ranjit kurian said:
I need a excel function to remove the numbers, decimals, coma, semicolon and
words like "LTD", "INC" from the cells.

The name in the cells are not fixed, it keep changes

example :
my cell ("A1") contain the below details
Bank of America LTD Inc, US1234567-001.

the excel function should give the below answer
Bank of America
 
R

Rick Rothstein

Are you saying that the final item (the US1234567-001 in your example) is
not always separated from the rest of your text by a comma? What about
things like LTD, Inc, etc... will a company name always have something like
one of those abbreviations after it, or could a company name be shown
without them?

Maybe Max's idea is the way to go. Do you have (or can you get) a full list
of companies that could be located at the beginning of your text?
 
R

Rick Rothstein

Here is perhaps another one. You had this in your code...

If sChar Like "[0-9]" Then
Else
If sChar = "." Or sChar = "," Or sChar = ";" Then
Else
sStr1 = sStr1 & sChar
End If
End If

It looks like this is ignoring digits, dots, commas and semi-colons. You
could combine the punctuation test with the digit test like this...

If sChar Like "[0-9.,;]" Then
Else
sStr1 = sStr1 & sChar
End If

But here is what I think you will consider the pearl... if you prefix the
characters inside the square brackets with an exclamation sign (!), then the
Like operator will test sChar for any character **not** equal to the rest of
the characters in the list. So, your original If-Then structure can be
reduced to this...

If sChar Like "[!0-9.,;] Then sStr1 = sStr1 & sChar
 
R

Ron Rosenfeld

the company name?

answer: no, nothing is constant here, keep changes.

And if you think its not possibe to write any formula when the things are
not constant(fixed), then i will use your formula where i will have a 'final
comma' and punctuation are found as you said below.

Using the "rules" I mentioned, you could use this UDF.

To enter the UDF <alt-F11> opens the VB Editor. Ensure your project is
selected in the project explorer window, then Insert/Module and paste the code
below into the window that opens.

The third line of the code is a pipe-delimited list of the various suffixes to
the company name that you wish to remove. You will have to add on whatever
else there is besides LTD and INC

To use this, enter a formula of the type

=CorpName(cell_ref)

where cell_ref is the address containing the name to be processed.

============================
Option Explicit
Function CorpName(str As String) As String
Const sSuffix = "LTD|INC"
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.Global = True
re.Pattern = "(^[\s\S]+?)\s+\b(" & _
sSuffix & ")\b[^,]*,?[^,]*$"
CorpName = re.Replace(str, "$1")
End Function
=================================
--ron
 
G

Gary''s Student

I am, once again, in your debt.
--
Gary''s Student - gsnu200816


Rick Rothstein said:
Here is perhaps another one. You had this in your code...

If sChar Like "[0-9]" Then
Else
If sChar = "." Or sChar = "," Or sChar = ";" Then
Else
sStr1 = sStr1 & sChar
End If
End If

It looks like this is ignoring digits, dots, commas and semi-colons. You
could combine the punctuation test with the digit test like this...

If sChar Like "[0-9.,;]" Then
Else
sStr1 = sStr1 & sChar
End If

But here is what I think you will consider the pearl... if you prefix the
characters inside the square brackets with an exclamation sign (!), then the
Like operator will test sChar for any character **not** equal to the rest of
the characters in the list. So, your original If-Then structure can be
reduced to this...

If sChar Like "[!0-9.,;] Then sStr1 = sStr1 & sChar
 
G

Gary''s Student

If the words appear frequestly, just create some new Replace statements for
them.
--
Gary''s Student - gsnu200816


Ranjit kurian said:
The below code works fine, but is it possible to remove the two words which
always begins with number

example:
the number always begins with some words like (US, CA, USB, USBW), so is it
possible to remove words which begins with word

Bank of America LTD Inc, US1234567-001.
answer expected to be like below
Bank of America


Gary''s Student said:
Select the cells you want to process and run this simple macro:

Sub clean_it()
' gsnuxx
Dim sStr As String, i As Long, sStr1 As String
Dim sChar As String, rng As Range
For Each rng In Selection
sStr = rng.Value
For i = 1 To Len(sStr)
sChar = Mid(sStr, i, 1)
If sChar Like "[0-9]" Then
Else
If sChar = "." Or sChar = "," Or sChar = ";" Then
Else
sStr1 = sStr1 & sChar
End If
End If
Next
sStr1 = Replace(sStr1, "INC", "")
sStr1 = Replace(sStr1, "Inc", "")
sStr1 = Replace(sStr1, "LTD", "")
sStr1 = Replace(sStr1, "Ltd", "")
rng.Value = sStr1
Next
End Sub
--
Gary''s Student - gsnu200816


Ranjit kurian said:
I need a excel function to remove the numbers, decimals, coma, semicolon and
words like "LTD", "INC" from the cells.

The name in the cells are not fixed, it keep changes

example :
my cell ("A1") contain the below details
Bank of America LTD Inc, US1234567-001.

the excel function should give the below answer
Bank of America
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top