Nested formula to search a text string and return specific text

B

Barbie

Hello.
I have a large spreadsheet that lists many text strings in a single column.
The text strings include the name of a company, product, platform, and
language. I would like to isolate the language from the text string and
populate the cells in another column with only the name of that language. The
text string does not follow a consistent format where there are the same
number of words or where the language is always in the same position in the
sequence. There are 24 different languages that are used in the text strings,
and some of the languages are more than one word, such as "Chinese
Traditional" or "Chinese Simplified."

Using the formula =IF(SEARCH("German",D44),"German") works if the word
German is part of the text string in cell D44. However, I need to be able to
add all of the other languages into a single formula to verify line by line
what language is in the text string.

When I try to add other languages to the formula, such as
=IF(SEARCH("German",D44),"German"),IF(SEARCH("Danish",D44),"Danish"), the
result is #VALUE!, which is not what I want.

Can this be accomplished through a formula, and if so, how? Would it work
better to set this up as a macro? If so, any guidance on how to do that would
be much appreciated.

Thank you!
 
P

Pete_UK

Here's an approach that illustrates how it could be done, although I'm
sure there must be shorter ways (but it's late here!!). Suppose your
text containing the language is in column D, starting with D1, and
assume you have just 4 languages in M1:M4. Put this formula in E1:

=IF(ISNUMBER(SEARCH(M$1,D1)),M$1,"")&IF(ISNUMBER(SEARCH(M$2,D1)),M
$2,"")&IF(ISNUMBER(SEARCH(M$3,D1)),M$3,"")&IF(ISNUMBER(SEARCH(M
$4,D1)),M$4,"")

and copy down. Here it is again but split manually so that you can see
how it is put together:

=IF(ISNUMBER(SEARCH(M$1,D1)),M$1,"")
&IF(ISNUMBER(SEARCH(M$2,D1)),M$2,"")
&IF(ISNUMBER(SEARCH(M$3,D1)),M$3,"")
&IF(ISNUMBER(SEARCH(M$4,D1)),M$4,"")

Notice how each term is very similar to the one before it - the only
change is the (absolute) row for the M cell reference. If you have 24
languages then the formula would be:

=IF(ISNUMBER(SEARCH(M$1,D1)),M$1,"")&IF(ISNUMBER(SEARCH(M$2,D1)),M
$2,"")&IF(ISNUMBER(SEARCH(M$3,D1)),M$3,"")&IF(ISNUMBER(SEARCH(M
$4,D1)),M$4,"")&IF(ISNUMBER(SEARCH(M$5,D1)),M$5,"")&IF(ISNUMBER(SEARCH
(M$6,D1)),M$6,"")&IF(ISNUMBER(SEARCH(M$7,D1)),M$7,"")&IF(ISNUMBER
(SEARCH(M$8,D1)),M$8,"")&IF(ISNUMBER(SEARCH(M$9,D1)),M$9,"")&IF
(ISNUMBER(SEARCH(M$10,D1)),M$10,"")&IF(ISNUMBER(SEARCH(M$11,D1)),M
$11,"")&IF(ISNUMBER(SEARCH(M$12,D1)),M$12,"")&IF(ISNUMBER(SEARCH(M
$13,D1)),M$13,"")&IF(ISNUMBER(SEARCH(M$14,D1)),M$14,"")&IF(ISNUMBER
(SEARCH(M$15,D1)),M$15,"")&IF(ISNUMBER(SEARCH(M$16,D1)),M$16,"")&IF
(ISNUMBER(SEARCH(M$17,D1)),M$17,"")&IF(ISNUMBER(SEARCH(M$18,D1)),M
$18,"")&IF(ISNUMBER(SEARCH(M$19,D1)),M$19,"")&IF(ISNUMBER(SEARCH(M
$20,D1)),M$20,"")&IF(ISNUMBER(SEARCH(M$21,D1)),M$21,"")&IF(ISNUMBER
(SEARCH(M$22,D1)),M$22,"")&IF(ISNUMBER(SEARCH(M$23,D1)),M$23,"")&IF
(ISNUMBER(SEARCH(M$24,D1)),M$24,"")

and it would leave a blank cell if none of the languages are present
in the corresponding cell of column D. Just copy this down as far as
you need to.

One thing to be wary of, though, is that a cell containing Germany (in
the address or company name) will match with German, so this could be
a bit misleading.

Hope this helps.

Pete
 
T

T. Valko

Make a list of the 24 languages. If you have variations of a single language
like with your Chineses example, sort this list in ascending order. For
example:

H1: Chinese
H2: Chinese Simplified
H3: Chinese Traditional

Let's assume we have these text strings:

A1: This is Chinese Traditional
A2: Plain old Chinese
A3: Chinese Simplified is next

Enter this formula in B1 and copy down to B3:

=LOOKUP(1E+100,SEARCH(H$1:H$3,A1),H$1:H$3)
 
R

Ron Rosenfeld

Hello.
I have a large spreadsheet that lists many text strings in a single column.
The text strings include the name of a company, product, platform, and
language. I would like to isolate the language from the text string and
populate the cells in another column with only the name of that language. The
text string does not follow a consistent format where there are the same
number of words or where the language is always in the same position in the
sequence. There are 24 different languages that are used in the text strings,
and some of the languages are more than one word, such as "Chinese
Traditional" or "Chinese Simplified."

Using the formula =IF(SEARCH("German",D44),"German") works if the word
German is part of the text string in cell D44. However, I need to be able to
add all of the other languages into a single formula to verify line by line
what language is in the text string.

When I try to add other languages to the formula, such as
=IF(SEARCH("German",D44),"German"),IF(SEARCH("Danish",D44),"Danish"), the
result is #VALUE!, which is not what I want.

Can this be accomplished through a formula, and if so, how? Would it work
better to set this up as a macro? If so, any guidance on how to do that would
be much appreciated.

Thank you!

Set up a column with a list of all the possible languages. You can NAME the
range or use an absolute range reference.

Then enter an **array** formula of the type:

=INDEX(Language_List,MATCH(TRUE,ISNUMBER(SEARCH(Language_List,D44)),0))

To enter an **array** formula, hold down <ctrl><shift> while hitting <enter>.
Excel will place braces {...} around the formula.

This formula will return the language that is in D44, no matter where it is
within the cell.

In setting up your language list, the order may be important. If you have two
languages where the first word is the same, be sure to have the multi-word
entry prior to the single word entry, otherwise the formula will find the
single word entry.
--ron
 
R

Ron Rosenfeld

Hello.
I have a large spreadsheet that lists many text strings in a single column.
The text strings include the name of a company, product, platform, and
language. I would like to isolate the language from the text string and
populate the cells in another column with only the name of that language. The
text string does not follow a consistent format where there are the same
number of words or where the language is always in the same position in the
sequence. There are 24 different languages that are used in the text strings,
and some of the languages are more than one word, such as "Chinese
Traditional" or "Chinese Simplified."

Using the formula =IF(SEARCH("German",D44),"German") works if the word
German is part of the text string in cell D44. However, I need to be able to
add all of the other languages into a single formula to verify line by line
what language is in the text string.

When I try to add other languages to the formula, such as
=IF(SEARCH("German",D44),"German"),IF(SEARCH("Danish",D44),"Danish"), the
result is #VALUE!, which is not what I want.

Can this be accomplished through a formula, and if so, how? Would it work
better to set this up as a macro? If so, any guidance on how to do that would
be much appreciated.

Thank you!

One other thought: There could be confusion if a language is part of the
company name or product.

Is it possible that the language is always in the same sequential position
counting from the end of the string, or some other way of differentiating the
language from the same word in a company name?
--ron
 
B

Barbie

I am lucky that the language is never part of the company name. I had given a
lot of thought to the sequential order of where the name of the language
falls in the sequence from both beginning and end, but there wasn't
consistency there. Your previous post worked like a charm for me, and I have
learned a valuable lesson in Excel. Thanks so much!
 
B

Barbie

Thank you for this suggestion. I tried the array approach first, and that
worked, but out of curiosity, I tried your approach, too. It also worked like
a charm. It's good to know there are multiple ways to skin a cat! Thanks so
much!
 
B

Barbie

Thanks for the help! I understand the logic of this approach, but I got an
error when I tried to use the suggested formula (changing it, of course, to
match the range of cells where I had listed the languages). There were a
couple of posts that worked, and I appreciate your help. Thank you!
 
R

Rick Rothstein \(MVP - VB\)

What about adjacent or surrounding characters that are unique within the
text? For example, are the name surrounded by parentheses or encased in
quote marks while none of the remaining text are?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top