Text manipulation

paulinoluciano

Hi People,

I have a sequence of characters like:

AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOKSPADAOEKOQPPDAOPSKAEPQ

This sequence will be in cell A2, and I have to perform some specific
operations on this text:

Example 1:
Rules:
a) Fragment the sequence before K but not always (you could have lost
cut).
b) Sequence is not cut if K is found before FP

Results:

AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOKSPADAOEKOQPPDAOPSKAEPQ

0 lost cuts = Cutting the sequence every time K is present
(the subsequences of this process should be put in column B):
AASSASDK
ASASDASFAFSASASADK
ASASAFPKQREWEAQEOK
SPADAOEK
OQPPDAOPSK
AEPQ

1 lost cut = Cutting the sequence after the first K present in the
sequence (The subsequences of this process should be put in C column::
AASSASDKASASDASFAFSASASADK
ASASAFPKQREWEAQEOKSPADAOEK
OQPPDAOPSKAEPQ
AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOK
SPADAOEKOQPPDAOPSKAEPQ

2 lost cuts = Cutting the sequence after the second K (just for the
third and following) present in the sequence (the subsequences of this
process should be put in column D):
AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOK
SPADAOEKOQPPDAOPSKAEPQ

Note that in some cases I need lost cuts in which you cut after 1, 2,
3, 4,... specific characters.
I have to specify such rules somewhere in the sheet containing the
precursor text.
The rules are:

Cut after "XXX" (In this example I have used K, but some cell in the
sheet must specify the character at which the sequence will be
fragmented. In some cases it could be more than one character,
e.g. K and R, not necessarily together.)
Cut before "XXX" (The cut may be after the character, as in the previous
example, or before it.)

Never before "XXX" (In some cases I have prohibited situations; e.g.
it must not cut a sequence at K if K is preceded by P or by RP.)
Never after "XXX" (Same for after.)

Number of times that the character could be missed prior to cutting at
"XXX" (Somewhere in the sheet I must specify how many characters could
be "lost" prior to a cut.)

Thanks in advance,
Luciano
 
Pete_UK

Did you not get the answer to this in your thread which started 28th
December?

Pete
 
Ron Rosenfeld

I don't know if you were the original poster in this thread. But there are
various solutions posted in this thread to an identical problem. You should
try them first, and then post back regarding success or problems.


--ron
 
paulinoluciano

Although I received several very good suggestions in that thread
at the time, they did not work very well for my exact problem. Therefore
I would like to know if someone could help me again. A representative
example of such a process can be seen at
http://delphi.phys.univ-tours.fr/Prolysis/cutter.html
However, that engine is not as robust as I would need; I need to be able
to specify all the desired rules in an Excel sheet.
Thank you anyway.
Luciano
 
Ron Rosenfeld

As I've written before, you can do this with Longre's add-in using Regular
Expressions.

B2:
=REGEX.MID($A$2,"(.*?(?<!FP)(K|$)){"&COLUMNS($B:B)&"}",ROWS($B$2:B2))

copy/drag down and across for 0, 1 and 2 lost cuts.
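
For anyone who wants to check the pattern without the add-in, here is a rough Python sketch of how the formula fills the grid. It rests on assumptions: regex_mid and cell are hypothetical helper names, and the third argument of REGEX.MID is taken to mean "return the n-th match" (inferred from the use of ROWS above). Empty matches at the end of the text are skipped. Python's re module accepts the same lazy quantifier and fixed-width lookbehind.

```python
import re

SEQ = "AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOKSPADAOEKOQPPDAOPSKAEPQ"


def regex_mid(text, pattern, n):
    """Return the n-th (1-based) non-empty match of pattern in text, or ""."""
    matches = [m.group(0) for m in re.finditer(pattern, text) if m.group(0)]
    return matches[n - 1] if 1 <= n <= len(matches) else ""


def cell(row, col):
    """Value for the cell 'row' rows down and 'col' columns across, counting B2 as (1, 1).

    col plays the role of COLUMNS($B:B) and row the role of ROWS($B$2:B2).
    """
    # Same pattern as the worksheet formula; only the repeat count changes per column.
    pattern = "(.*?(?<!FP)(K|$)){%d}" % col
    return regex_mid(SEQ, pattern, row)


if __name__ == "__main__":
    for col in (1, 2, 3):
        column = [cell(row, col) for row in range(1, 7)]
        print(col - 1, "lost cut(s):", [c for c in column if c])
```

Under those assumptions, the first column gives the six fragments listed for 0 lost cuts, the second gives three fragments, and the third gives two.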

But some of your examples don't make sense to me.

For example:

================
1 lost cut = Cutting the sequence after the first K present in the
sequence (The subsequences of this process should be put in C column::
AASSASDKASASDASFAFSASASADK
ASASAFPKQREWEAQEOKSPADAOEK
OQPPDAOPSKAEPQ
AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOK
SPADAOEKOQPPDAOPSKAEPQ
========================

I don't understand how you get lines 4 and 5.

Also, in your basic rules, you write:

"Sequence is not cut if K is found before FP"

But in your examples it seems as if you're acting on a rule to "not cut if K is
found AFTER FP"

=========================

So far as having some variability, you could name two cells on your worksheet

CutAfter
UnlessAfter

In your example, CutAfter would be K
UnlessAfter would be FP

And the formula would be:

=REGEX.MID($A$2,"(.*?(?<!"&UnlessAfter&")("&CutAfter&"|$)){"&COLUMNS($B:B)&"}",ROWS($B$2:B2))

Again, you drag down and across as far as necessary to accommodate all the
"cuts" and all the "lost cuts".
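
As a rough illustration of how the two named cells could extend to several cut characters (e.g. K and R) and several "never after" strings (e.g. P and RP), here is a Python sketch. The function names and parameters are hypothetical, not part of the add-in. It also differs from the formula in two ways: the lookbehind guards are attached only to the cut character, not to the end of the text, and each exception gets its own lookbehind because Python requires fixed-width lookbehinds.

```python
import re


def build_pattern(cut_after, unless_after, lost_cuts):
    """Build a regex that matches one fragment.

    cut_after    -- iterable of single characters to cut after (e.g. "K", "R")
    unless_after -- iterable of literal strings that forbid a cut when they
                    immediately precede the cut character (e.g. "FP")
    lost_cuts    -- number of cut sites to skip inside each fragment
    """
    cut_class = "[" + "".join(re.escape(c) for c in cut_after) + "]"
    # One fixed-width negative lookbehind per exception, chained together.
    guards = "".join("(?<!%s)" % re.escape(u) for u in unless_after)
    return "(?:.*?(?:%s%s|$)){%d}" % (guards, cut_class, lost_cuts + 1)


def fragments(sequence, cut_after=("K",), unless_after=("FP",), lost_cuts=0):
    """Return the non-empty fragments of sequence under the given rules."""
    pattern = re.compile(build_pattern(cut_after, unless_after, lost_cuts))
    return [m.group(0) for m in pattern.finditer(sequence) if m.group(0)]


if __name__ == "__main__":
    seq = "AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOKSPADAOEKOQPPDAOPSKAEPQ"
    print(fragments(seq))                         # cut after K unless after FP
    print(fragments(seq, cut_after=("K", "R")))   # cut after K or R
    print(fragments(seq, lost_cuts=1))            # one lost cut
```

With cut_after=("K", "R") the third fragment of the example becomes ASASAFPKQR followed by EWEAQEOK, since the R is not preceded by FP.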

If these don't work, you will have to both explain more clearly, and also give
examples of the results of the formulas, and the expected results.


--ron
 
Harlan Grove

paulinoluciano wrote...
I have a sequence of characters like:

AASSASDKASASDASFAFSASASADKASASAFPKQREWEAQEOKSPADAOEKOQPPDAOPSKAEPQ

This sequence must be put in cell A2.
Thus, I have to perform some specific operations in this text:

Example 1:
Rules:
a) Fragment the sequence before K but not always (you could have lost cut).
b) Sequence is not cut if K is found before FP
....

The answers haven't changed since December/January. It's a near
certainty no one will provide answers much different from the ones you
were given then. Most of us tested our proposed solutions before we
posted them to the newsgroup, so *we* can get them to work. It seems
*you* couldn't get them to work.

So that begs the question whether you'd be able to implement any other
solutions other people provide. The odds would seem to be against that
happy possibility.

It'd make more sense for you to explain *IN* *DETAIL* how the solutions
you received 4 months ago didn't work. I recall one issue was that you
were using a Portuguese language version of Excel. Another problem was
VBA. With respect to the former, if you post in English language
newsgroups, translation into other languages is either up to you, or
you could crosspost to the appropriate Portuguese language Excel
newsgroup and hope that someone there could translate function calls
(or maybe come up with another solution). For the latter, in VBA if you
use the .Formula or .FormulaR1C1 properties of Range objects, Excel
*automatically* translates English language function calls into local
language function calls. You need to avoid using .FormulaLocal and
.FormulaR1C1Local properties to enter formulas with English function
calls.
 
