Extracting Specific Numbers in a String of Text

C

cardan

Hello,

I have what I think may be a simple question. I am working with some
XBRL. The way I am going about this problem is not the intent of the
code but it is a workaround. If someone can help me put the code into
excel and have it work properly I am all ears! Anyways,
I am copying and pasting strings of text (actually lines of xbrl code)
into an Excel worksheet. Each copy/paste is about 1000 rows long. Each
string comes in under one cell and is different in length but the
format is the same. Example lines are below. I need to extract out
the tag (RIAD4461 for example) and the number at towards the end that
is between brackets >###<. With my limited knowledge I can get the
RIAD4461 number with a left(cell,9) and then a right(cell,8) However I
am not sure how to get the number given their different number of
characters. Any help would be greatly appreciated.

Example Lines of Code
line 1  <cc:RIAD4461 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">169000</cc:RIAD4461> (end line 1)

line 2 <cc:RIAD4462 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">443000</cc:RIAD4462> (end line 2)

line 3  <cc:RIAD4463 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">97000</cc:RIAD4463> (end line 3)
 
R

Rick Rothstein

Assuming all your text always has the "shape" you showed us, this formula
will return the number between the ">###<"...

=LOOKUP(9E+307,--LEFT(MID(A1,FIND(">",A1)+1,99),ROW($1:$99)))

--
Rick (MVP - Excel)


Hello,

I have what I think may be a simple question. I am working with some
XBRL. The way I am going about this problem is not the intent of the
code but it is a workaround. If someone can help me put the code into
excel and have it work properly I am all ears! Anyways,
I am copying and pasting strings of text (actually lines of xbrl code)
into an Excel worksheet. Each copy/paste is about 1000 rows long. Each
string comes in under one cell and is different in length but the
format is the same. Example lines are below. I need to extract out
the tag (RIAD4461 for example) and the number at towards the end that
is between brackets >###<. With my limited knowledge I can get the
RIAD4461 number with a left(cell,9) and then a right(cell,8) However I
am not sure how to get the number given their different number of
characters. Any help would be greatly appreciated.

Example Lines of Code
line 1 <cc:RIAD4461 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">169000</cc:RIAD4461> (end line 1)

line 2 <cc:RIAD4462 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">443000</cc:RIAD4462> (end line 2)

line 3 <cc:RIAD4463 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">97000</cc:RIAD4463> (end line 3)
 
C

cardan

Assuming all your text always has the "shape" you showed us, this formula
will return the number between the ">###<"...

=LOOKUP(9E+307,--LEFT(MID(A1,FIND(">",A1)+1,99),ROW($1:$99)))

--
Rick (MVP - Excel)


Hello,

I have what I think may be a simple question. I am working with some
XBRL.  The way I am going about this problem is not the intent of the
code but it is a workaround. If someone can help me put the code into
excel and have it work properly I am all ears!  Anyways,
I am copying and pasting strings of text (actually lines of xbrl code)
into an Excel worksheet. Each copy/paste is about 1000 rows long. Each
string comes in under one cell and is different in length but the
format is the same. Example lines are below.  I need to extract out
the tag (RIAD4461 for example) and the number at towards the end that
is between brackets >###<. With my limited knowledge I can get the
RIAD4461 number with a left(cell,9) and then a right(cell,8) However I
am not sure how to get the number given their different number of
characters. Any help would be greatly appreciated.

Example Lines of Code
 line 1 <cc:RIAD4461 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">169000</cc:RIAD4461> (end line 1)

line 2 <cc:RIAD4462 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">443000</cc:RIAD4462> (end line 2)

line 3 <cc:RIAD4463 contextRef="CD_508878_2008-09-30" unitRef="USD"
decimals="0">97000</cc:RIAD4463> (end line 3)

Your assumption about the "shape" is correct. It is working well-even
though I can't explain the formula. the only issue is when the value
is 0. It returns an NA. I was able to fix this with an if(iserror
(.... Other than that, it is great. Thank you!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top