search for a specific word and copy the word and the preceeding words until a comma

D

DHANANJAY

Hello everybody,

I have addresses entered as in below in several rows. Note that the
entire address for each individual is put in a single cell as text. All
cells have commas though and "text to column" option is completely
ruled out. There are about 50000 such entries and they do not contain
words in fixed order. For instance, in the following example, say the
rows are in rows B1 to B1000, I wish to place a formula in C1 to C1000
and find if the corresponding cell in B1 to B1000 has the word "Taluk"
and if so, to also read the word or words before "Taluk" - all occuring
until a comma and then to copy the entire set of words in C1 to C1000.

Kempapura, Hasigala Post, Hoskote Taluk, Bangalore Rural
Upparahalli, Kumbalahalli Post, Hona Hoskote Taluk, Bangalore Rural
Dist. Pin. 562114
#42/10, Suresh Nilaya, Behind Dharmaraya Temple, Malinapet, Taluk,
Bangalore
J 36, Pipline Malleshwaram, Ganesha Block, Bangalore
Taluk,Viveknagar II stage, Sulibele Road, Bangalore Rural Dist


This is what I a single formula to accomplish :

In row 1 below, I need to get "Hoskote Taluk"
In row 2 below, I need to get "Hona Hoskote Taluk"
In row 3 below, I need to get "Taluk"
In row 4 below, I need to get nothing
In row 5 below, I need to get "Taluk"

Kempapura, Hasigala Post, Hoskote Taluk, Bangalore Rural
Upparahalli, Kumbalahalli Post, Hona Hoskote Taluk, Bangalore Rural
Dist. Pin. 562114
#42/10, Suresh Nilaya, Behind Dharmaraya Temple, Malinapet, Taluk,
Bangalore
J 36, Pipline Malleshwaram, Ganesha Block, Bangalore
Taluk,Viveknagar II stage, Sulibele Road, Bangalore Rural Dist


All support is appreciated.

DHANANJAY
 
D

DHANANJAY

I wish to clarify that in the example, the words seem to not be
contained within a cell. But all of my entries that I want to
manipulate are contained within a cell and in the above example the
data is B1 to B5
 
D

DHANANJAY

I wish to clarify that in the example, the words seem to not be
contained within a cell. But all of my entries that I want to
manipulate are contained within a cell and in the above example the
data is B1 to B5
 
K

kounoike

how about this one:

you could input a formula into c1 like
=mypickup(B1,"Taluk")
and copy c1 to any ranges you like.
but not recommend to use this function as UDF, because you have so many data.
i would rather run macro mytest.

Sub mytest()
Dim rng As Range
Dim i As Long

Application.ScreenUpdating = False
Set rng = Range("b1:b" & Cells(Cells.Rows.Count, "b").End(xlUp).Row)
i = 1
For Each s In rng
Cells(i, "c") = mypickup(s, "Taluk")
i = i + 1
Next
End Sub

Function mypickup(ByVal s, ByVal sw As String, Optional del As String = ",") As
String
Dim tmp
Dim swlen As Long, pos As Long

tmp = split(s, del)
swlen = Len(sw)
For i = LBound(tmp) To UBound(tmp)
pos = InStr(tmp(i), sw)
If pos > 0 Then
tmp(i) = Left(tmp(i), pos + swlen - 1)
If mypickup = "" Then
mypickup = Trim(tmp(i))
Else
mypickup = mypickup & ", " & Trim(tmp(i))
End If
End If
Next
End Function

keizi
 
D

DHANANJAY

Thank You very much,Kounoike.

Unfortunately, I am not a technical person and I tried running the
formula you have so painstakingly developed. When I ran that formula,
there is an error line, "subject or function not defined". When this
prompt appears, the word 'Mypickup' is highlighted. In fact, I just
copied the formula you wrote and did no change whatsoever to it.

I would appreciate if you take time to teach me how the formula works
so that I can better run it. Thank You
 
K

kounoike

Hi DHANANJAY

sorry for your inconvenience. i'm not sure this is the cause of error.
declaration of function must be written on one line or recognised as
on one line(this means if you want to write it into multiple lines,
you must break statement using the line-continuation charactor
, a space followed by an underscore.)
my guess is when you copy the code below, it may spread into
two lines.

Function mypickup(ByVal s, ByVal sw As String, Optional del As String = ",") As
String

so, try to make statement above into one line as look like
Function mypickup(----------) As string
also, i assume data is populated in columns("b").

keizi
 
D

DHANANJAY

Dear kounoike,

Thank You so much. You guessed it right I did little more than what
you instructed and I got the whole thing right.

Thank You Again.

Lots of Man Hours have been saved solely due to your expertise here in
Bangalore City, India.
 
R

Ron Rosenfeld

Hello everybody,

I have addresses entered as in below in several rows. Note that the
entire address for each individual is put in a single cell as text. All
cells have commas though and "text to column" option is completely
ruled out. There are about 50000 such entries and they do not contain
words in fixed order. For instance, in the following example, say the
rows are in rows B1 to B1000, I wish to place a formula in C1 to C1000
and find if the corresponding cell in B1 to B1000 has the word "Taluk"
and if so, to also read the word or words before "Taluk" - all occuring
until a comma and then to copy the entire set of words in C1 to C1000.

Kempapura, Hasigala Post, Hoskote Taluk, Bangalore Rural
Upparahalli, Kumbalahalli Post, Hona Hoskote Taluk, Bangalore Rural
Dist. Pin. 562114
#42/10, Suresh Nilaya, Behind Dharmaraya Temple, Malinapet, Taluk,
Bangalore
J 36, Pipline Malleshwaram, Ganesha Block, Bangalore
Taluk,Viveknagar II stage, Sulibele Road, Bangalore Rural Dist


This is what I a single formula to accomplish :

In row 1 below, I need to get "Hoskote Taluk"
In row 2 below, I need to get "Hona Hoskote Taluk"
In row 3 below, I need to get "Taluk"
In row 4 below, I need to get nothing
In row 5 below, I need to get "Taluk"

Kempapura, Hasigala Post, Hoskote Taluk, Bangalore Rural
Upparahalli, Kumbalahalli Post, Hona Hoskote Taluk, Bangalore Rural
Dist. Pin. 562114
#42/10, Suresh Nilaya, Behind Dharmaraya Temple, Malinapet, Taluk,
Bangalore
J 36, Pipline Malleshwaram, Ganesha Block, Bangalore
Taluk,Viveknagar II stage, Sulibele Road, Bangalore Rural Dist


All support is appreciated.

DHANANJAY

1. Download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr/

2. If your string is in A1, use the following formula:

=REGEX.MID(A1,"(?<=\s|^)[\w\s]*Taluk")

copy/drag down as needed.


--ron
 
K

kounoike

Hi DHANANJAY

you're welcome.
but, Ron has showed us more simple and flexible solution.
thanks Ron for introducing a very useful web site.

by the way, Ron's formula works fine and applied it to the case which has
plural match. The manual says it can be used as array function, so i entered
as array formula but i could'nt get every matches, only first match.
Of course, i can get every match using index parameter, or i've misread
the manual?

i'll appreciate if someone show me the way.

Thanks
keizi
 
R

Ron Rosenfeld

Hi DHANANJAY

you're welcome.
but, Ron has showed us more simple and flexible solution.
thanks Ron for introducing a very useful web site.

by the way, Ron's formula works fine and applied it to the case which has
plural match. The manual says it can be used as array function, so i entered
as array formula but i could'nt get every matches, only first match.
Of course, i can get every match using index parameter, or i've misread
the manual?

i'll appreciate if someone show me the way.

Thanks
keizi

I don't see what you mean when you say "plural match" in the examples given.
In the examples given, I see no need for an array form.

Perhaps I am missing something?
--ron
 
K

kounoike

Ron Rosenfeld said:
I don't see what you mean when you say "plural match" in the examples given.
In the examples given, I see no need for an array form.

Perhaps I am missing something?
--ron

Thank you for your reply, Ron.

I'm Japaneses and my English is not as good enough to express what i want
to say as do in my mother tongue. but i'll try.
as you said, there is no need of array form in the examples given.
it's just my interest. if A1 has a data something like this

Kempapura, Hasigala Post, Hoskote Taluk, Bangalore Rural, #42/10,
Behind Dharmaraya Temple, Malinapet, Hona Hoskote Taluk

and want to extract every matches, How could i apply REGEX.MID for this?
i mean what i want in this case is "Hoskote Taluk" and "Hona Hoskote Taluk"

i can get these respectively by
=REGEX.MID(A1,"(?<=\s|^)[\s\w]*Taluk",1) and
=REGEX.MID(A1,"(?<=\s|^)[\s\w]*Taluk",2)
and i thought i could get these in the range by using as array formula,
as it is said to be able to use as array funntion.

Possibly, i misunderstood this function.

Thanks
keizi
 
R

Ron Rosenfeld

Ron Rosenfeld said:
I don't see what you mean when you say "plural match" in the examples given.
In the examples given, I see no need for an array form.

Perhaps I am missing something?
--ron

Thank you for your reply, Ron.

I'm Japaneses and my English is not as good enough to express what i want
to say as do in my mother tongue. but i'll try.
as you said, there is no need of array form in the examples given.
it's just my interest. if A1 has a data something like this

Kempapura, Hasigala Post, Hoskote Taluk, Bangalore Rural, #42/10,
Behind Dharmaraya Temple, Malinapet, Hona Hoskote Taluk

and want to extract every matches, How could i apply REGEX.MID for this?
i mean what i want in this case is "Hoskote Taluk" and "Hona Hoskote Taluk"

i can get these respectively by
=REGEX.MID(A1,"(?<=\s|^)[\s\w]*Taluk",1) and
=REGEX.MID(A1,"(?<=\s|^)[\s\w]*Taluk",2)
and i thought i could get these in the range by using as array formula,
as it is said to be able to use as array funntion.

Possibly, i misunderstood this function.

Thanks
keizi

OK, I think I understand what you are doing.

The function can return either a horizontal or vertical array. But to do so,
you need to specify the multiple arguments in the form of an array constant
within the function.

This behavior is similar to standard Excel functions.

For example, to return a vertical array of the 1st and 2nd instances, you could
enter:

=REGEX.MID(A1,"(?<=\s|^)[\w\s]*Taluk",{1;2})

as an array formula in adjacent rows. In other words, you could select B1 &
B2; enter the formula in B1 and then hold down <ctrl><shift> while hitting
<enter>. Excel should copy the formula into B2, and surround both with braces
{...}.

If you have some unknown number of matches, you could replace the array
constant in the above {1;2} with the function: ROW(INDIRECT("1:n")) where n is
the maximum number of matches that might be, and then enter the formula in
multiple cells as I described above.

Furthermore, if you require a horizontal array, change the array constant to
{1,2}.

Please note that I am using the default US separators in my equations. If your
Excel version has different separators, you may also need different separators
than I used in the array constant.


Also, you could obtain these results without using an array formula. For
example, in B1 you could enter the formula:

=REGEX.MID($A$1,"(?<=\s|^)[\w\s]*Taluk",ROW())

and copy/drag down. The ROW() would give you the sequential numbers for the
different matches.

To return in a horizontal fashion, you could, in B1, use the formula:

=REGEX.MID($A$1,"(?<=\s|^)[\w\s]*Taluk",COLUMN()-1)

and copy/drag across.


--ron
 
K

kounoike

Hi Ron

Uhmm..fine! I don't have any single word to ask you for this function further.
Maybe, i, by myself, could'nt find the usage of this function by any strech of
my imagination.
Any way I thank you very much. Have a happy new year.

Regards
keizi
 
Top