Finding a string of unknown length in a string of unknown length, Help!

H

Hankjam

Hi

I've tried variations on MID, FIND, LEN, LEFT and Right and still not
able to do this.

I have strings that can be 89, 90, 91 or 92 characters long and
sometimes more.

I want to find the name of an enzyme in this string that can vary to
between 5 and 12 characters.

Something like this:

Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0
ATP Concentration
Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at
100.0 ATP Concentration
Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at
45.0 ATP Concentration

I want to list in separate cells:
1) the number of the Daughterboard (286338)
2) the name of the kinase (XXX(h))
3) The ATP concentration

and I don't want to column pharse, if that is the right term.

Many thanks for your time

Andrew j
 
P

Pete_UK

I'm a bit confused. Where is the name of your enzyme? What column is
your 89+ character string in? Do you want to split the 3 items out
into the next 3 columns for every row of data, or just for some that
match the enzyme? Does your data follow the example closely, with
variation in character length/position only in the ATP concentration
and what you call the kinase?

Pete
 
H

Hankjam

I'm a bit confused. Where is the name of your enzyme? What column is
your 89+ character string in? Do you want to split the 3 items out
into the next 3 columns for every row of data, or just for some that
match the enzyme? Does your data follow the example closely, with
variation in character length/position only in the ATP concentration
and what you call the kinase?

Pete
<snipped more>

My powers of written communication have never been a strong point.

"Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at
12.0 ATP Concentration"

is one example of a string copied into an excel cell A1.

From it I want to extract into three other cells the DB number, the
kinase name, which in this case is a three character string and the
isoform and the concentration of the ATP

286338 (=MID(A1,30,6), works pretty well)
XXX(h)
12.0 (=MID(A1,FIND(" at ",A1)+3,6), works but it is not very tidy)


The kinase name can vary in length, there are hundreds of them.
The ATP concentration can vary, we use 10 diffferent concentrations.

Cannot find anything to work for the variable and unkown nature of the
Kinase name, XXX(h) or XXXXXX(h)...

Cheers

Andrew j
 
P

Pete_UK

Okay, I put your example data in A1:A3 and the following formulae in
the cells stated:

B1: =MID(A1,30,SEARCH(" ",A1,30)-30)

C1: =MID(A1,SEARCH(") for",A1)+6,SEARCH(" at ",A1)-SEARCH(")
for",A1)-6)

D1: =MID(A1,SEARCH(" at ",A1)+4,FIND(" ATP",A1)-SEARCH(" at ",A1)-4)

and this gave the answers you wanted.

Better to copy the formulae from here into your sheet, but if you type
them in take care with the spaces and be wary of spurious line-breaks
that sometimes occur on the newsgroups.

Copy B1:D1 down for as many items as you have in column A.

Hope this helps.

Pete
 
R

RagDyeR

Depending on how accurate your examples were, these might or might not work
for you.

These are longer then what you tried because I added the possibility that
the data might have varying lengths.

With data starting in A1 and going down,

Enter this in B1 for the daughterboard number:

=MID(A4,FIND("board",A4)+6,6)


Enter this in C1 for the name of the kinase enzyme:

=MID(SUBSTITUTE(A4,"for","^^",2),FIND("^^",SUBSTITUTE(A4,"for","^^",2))+3,
FIND("(h)",A4)-FIND("^^",SUBSTITUTE(A4,"for","^^",2)))


Enter this in D1 for the ATP concentration:

=MID(A4,FIND(" at ",A4)+3,FIND("ATP",A4)-FIND(" at ",A4))


Then select B1 to D1, and copy down as needed.


--

HTH,

RD
=====================================================
Please keep all correspondence within the Group, so all may benefit!
=====================================================

I'm a bit confused. Where is the name of your enzyme? What column is
your 89+ character string in? Do you want to split the 3 items out
into the next 3 columns for every row of data, or just for some that
match the enzyme? Does your data follow the example closely, with
variation in character length/position only in the ATP concentration
and what you call the kinase?

Pete
<snipped more>

My powers of written communication have never been a strong point.

"Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at
12.0 ATP Concentration"

is one example of a string copied into an excel cell A1.

From it I want to extract into three other cells the DB number, the
kinase name, which in this case is a three character string and the
isoform and the concentration of the ATP

286338 (=MID(A1,30,6), works pretty well)
XXX(h)
12.0 (=MID(A1,FIND(" at ",A1)+3,6), works but it is not very tidy)


The kinase name can vary in length, there are hundreds of them.
The ATP concentration can vary, we use 10 diffferent concentrations.

Cannot find anything to work for the variable and unkown nature of the
Kinase name, XXX(h) or XXXXXX(h)...

Cheers

Andrew j
 
D

Dave Peterson

One more--with the data in A1:

=MID(A1,30,6)
=MID(A1,57,SEARCH(" at ",A1)-57)
=MID(A1,SEARCH(" at ",A1)+4,SEARCH(" atp concentration",A1)-SEARCH(" at ",A1)-4)

If you want the concentration to be a real number (not text):

=--MID(A1,SEARCH(" at ",A1)+4,
SEARCH(" atp concentration",A1)-SEARCH(" at ",A1)-4)

(Format that cell to show the number of decimals you want.)
 
R

Ron Rosenfeld

Hi

I've tried variations on MID, FIND, LEN, LEFT and Right and still not
able to do this.

I have strings that can be 89, 90, 91 or 92 characters long and
sometimes more.

I want to find the name of an enzyme in this string that can vary to
between 5 and 12 characters.

Something like this:

Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0
ATP Concentration
Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at
100.0 ATP Concentration
Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at
45.0 ATP Concentration

I want to list in separate cells:
1) the number of the Daughterboard (286338)
2) the name of the kinase (XXX(h))
3) The ATP concentration

and I don't want to column pharse, if that is the right term.

Many thanks for your time

Andrew j

Depending on how representative your data above is, the following VBA Macro
might do what you want.

To enter the macro, <alt-F11> opens the VB Editor. Ensure your project is
highlighted in the project explorer window, then Insert/Module and paste the
code below into the window that opens.

To use this, select some range that has the data to be parsed. Then <alt-F8>
opens the macro dialog box. Select the ParseEnzymeString macro and <RUN>. The
data you requested will be entered into the three adjacent columns.

If this is not what you want, you will need to be more specific as to your data
formats.

=========================================
Option Explicit
Sub ParseEnzymeString()
Dim c As Range
Dim re As Object, mc As Object, m As Object
Dim i As Long
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.Pattern = "^.*?(\d+).*?for\s+(\S+).*?(\d\S+)"
For Each c In Selection
Range(c.Offset(0, 1), c.Offset(0, 3)).ClearContents
If re.test(c.Value) Then
Set mc = re.Execute(c.Value)
For i = 1 To 3
c.Offset(0, i).Value = mc(0).submatches(i - 1)
Next i
End If
Next c
End Sub
==============================
--ron
 
H

Hankjam

Depending on how accurate your examples were, these might or might not work
for you.

These are longer then what you tried because I added the possibility that
the data might have varying lengths.

With data starting in A1 and going down,

Enter this in B1 for the daughterboard number:

=MID(A4,FIND("board",A4)+6,6)


Enter this in C1 for the name of the kinase enzyme:

=MID(SUBSTITUTE(A4,"for","^^",2),FIND("^^",SUBSTITUTE(A4,"for","^^",2))+3,
FIND("(h)",A4)-FIND("^^",SUBSTITUTE(A4,"for","^^",2)))


Enter this in D1 for the ATP concentration:

=MID(A4,FIND(" at ",A4)+3,FIND("ATP",A4)-FIND(" at ",A4))


Then select B1 to D1, and copy down as needed.

Hi Ragdyer

It surely does.

Many thanks

Yours

Andrew j
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top