Importing a multiline .txt file

S

schnett

I want to import about 1000 records ( like the 2 below) into excel to
sort. What is the most efficient way to to import these to parse /
format these records to have one row as a record (vba ?? ) ? I know how
to import text files but it doesn't give the formatting options needed
for this file. I can provide details, if you need them.


Product Claim

--------------------------------------------------------------------------------

ITEM DETAILS 1 MODEL NUMBER:
E34FCGHJR465G
CODE NUMBER: 45657878345342679 NUMBER:
004753Z
MODEL: 3W45 PROD. ATTRIBUTE
3345678346
MODEL YEAR: 2002 PNC DESCRIPTION
456 4567278
PFP DESCRIPTION: MOUNT

CURRENT SERV. LIFE: 15789 MILES

REPAIR DATE: 20060123

PRODUCTION DATE: 20031002

SOLD DATE: 20050307

CLAIM DATE: 20060301

CHARGE PARTS CHARGE VENDOR RATE 0.6

TOTAL CLAIM AMOUNT: 3060 VENDOR AMOUNT:
10.36
CHARGE SPECIAL HANDLING VENDOR RATE 0.6

TOTAL CLAIM AMOUNT: VENDOR AMOUNT:

CHARGE LABOR CHARGE VENDOR RATE 0.6

TOTAL CLAIM AMOUNT: 1275 VENDOR AMOUNT:
4.356
PRR NUMBER: 1 TROUBLE CODE/DESC.
NOISE
PRR NUMBER: 2 TROUBLE CODE/DESC.
PLAY
HAYON N OUVRE PAS COMPLETEMENT

REMPLACER

REPAIR ACTION#: 1 PART NO:
904507S000
PART QUANTITY: 1 UNIT PRICE:
18.71
REPAIR ACTION#: 2 PART NO:
904517S000
PART QUANTITY: 1 UNIT PRICE:
11.89
REPAIR ACTION#: 3 OPERATION NO:
UM14A1
LABOR HOURS: 0.2

TOTAL CHARGEBACK CLAIM AMOUNT: 26.01

RECEIVER CLAIM NO: 015314060

BUSINESS TYPE PRODUCTION PART

STATE CODE NY

ADJSTMNT MEMO NO: 008677465


--------------------------------------------------------------------------------

ITEM DETAILS 2 MODEL NUMBER:
BLJALGA327EU5A----
CODE NUMBER: 1N4BA41E34C852439 NUMBER:
118340Z
MODEL: 6ZV2 PROD. ATTRIBUTE
3566892138
MODEL YEAR: 2004 PNC DESCRIPTION
84532 CASING LID TORS BAR/TAIL
PFP DESCRIPTION: LID ASSY-CASING

CURRENT SERV. LIFE: 6464 MILES

REPAIR DATE: 20060216

PRODUCTION DATE: 20030906

SOLD DATE: 20050331

CLAIM DATE: 20060301

CHARGE PARTS CHARGE VENDOR RATE 0.6

TOTAL CLAIM AMOUNT: 690 VENDOR AMOUNT:
4.14
CHARGE SPECIAL HANDLING VENDOR RATE 0.6

TOTAL CLAIM AMOUNT: VENDOR AMOUNT:

CHARGE LABOR CHARGE VENDOR RATE 0.6

TOTAL CLAIM AMOUNT: 1705 VENDOR AMOUNT:
10.23
PRR NUMBER: 3 TROUBLE CODE/DESC.
POOR GENERAL APPEARANCE
PRR NUMBER: 4 TROUBLE CODE/DESC.
INTERNAL FAILURE
CUSTOMER STATES THE RIGHT REAR CASING SUPPORT HAS COME APART SEE
SPECIAL ORDER PART
THE RIGHT REAR CASING STAY IS BROKEN OFF AT THE
BOTTOM REPLACED THE RIGHT REAR CASING STA
Y UM14AA 2

REPAIR ACTION#: 4 PART NO:
3566892138
PART QUANTITY: 1 UNIT PRICE:
6.9
REPAIR ACTION#: 5 OPERATION NO:
UM14A1
LABOR HOURS: 0.2

TOTAL CHARGEBACK CLAIM AMOUNT: 14.37

RECEIVER CLAIM NO: 015316900

BUSINESS TYPE PRODUCTION PART

STATE CODE MS

ADJSTMNT MEMO NO: 008679905
 
B

Bryan Hessey

If your data is all a fixed number of rows as per your sample, an
begins in A2, then in B2 put the formula:

=IF(LEFT($A2,12)="ITE
DETAILS",IF(COLUMN()=2,$A2,INDIRECT("$A"&ROW()+COLUMN()-2)),"")

and formula drag that across to column AG, then (whilst stil
selected) formula drag that (B2:AG2) down to the end of your data.

This should provide what you need on the first row of each item, chec
column AF

To extract the new form:

-note: To retain your current sequence as the final sequence:
in AH1 put 1, hold the CTRL key and formula drag this to the end o
your data to number each line-.

Select the whole sheet, and Copy, Paste Special, Values back ont
itsself

Delete column A

Select All data and Sort over column B, delete all blank lines.

Select All data and Sort over column AH (which is now column AG)

Delete columns AF and AG

Hope this helps

--
 
P

PY & Associates

The information provided appears to be one row of data, separated by a blank
row;
usually four fields per row except Trouble Code/Desc and the last 5 or 6
rows;
1st row identify as Item Details
last row identify as Adjustment Memo No

You are wanting to put field names across row 1
and put data from row 2 onwards so that you can sort.

We consider you need to use VBA to achieve this.
If this is one off job, we believe it is more cost effective to pay some
service provider to do.
 
B

Bryan Hessey

Well spotted, I read the question but not the data.

Row 1 appears to be 2 fields, sequence and Model Number but only one
':' seperator, row 2 has two ':' sperators, row 3 and 4 only one,
whilst others, like the Vendor rates, have none.

Could be an interesting VB parse unless the OP could re-extract
dropping the headers, I cannot see that the data will support any form
of text-to-columns separation.

--
 
B

Bryan Hessey

Just of interest for schnett, the attached formula (across Row 2) will
show roughly what will be required in specs for parsing your data, and
I guess highlight the need to, if possible, have the file supplied in a
set format.

Hope this assists you

File:http://www.excelforum.com/attachment.php?attachmentid=4663&d=1145536793

--

Bryan said:
Well spotted, I read the question but not the data.

Row 1 appears to be 2 fields, sequence and Model Number but only one
':' seperator, row 2 has two ':' sperators, row 3 and 4 only one,
whilst others, like the Vendor rates, have none.

Could be an interesting VB parse unless the OP could re-extract
dropping the headers, I cannot see that the data will support any form
of text-to-columns separation.

--


+-------------------------------------------------------------------+
|Filename: Book1v.zip |
|Download: http://www.excelforum.com/attachment.php?postid=4663 |
+-------------------------------------------------------------------+
 
P

PY & Associates

You should note that the field before Repair Action #1 is irregular. Who
knows what surprises lie in other set of data?

"Bryan Hessey" <[email protected]>
wrote in message
 
B

Bryan Hessey

Also that one line of data was split over two lines, and the second Set
was 2 lines shorter than the first, hence my comment that, if possible,
a more standard format would make life easier.

--
 
B

Bryan Hessey

Also that one line of data was split over two lines, and the second Se
was 2 lines shorter than the first, hence my comment that, if possible
a more standard format would make life easier.

However, with only 1,000 lines it might be quicker to a 'as-is' job an
manually fix it afterwards.

--
 
S

schnett

I want to thank you for all that formula coding that went into tha
attachment sheet.

This raw attachment is only a sample of the records. The records see
more regular than irregular.

I tried what you had suggested and it would have worked except there i
a blank space in front of all my characters. Here is the raw data file
Can you teach me what the formulas in order to extract data from thes
records just like you did before ?

Thank you

+-------------------------------------------------------------------
|Filename: Spc data.txt
|Download: http://www.excelforum.com/attachment.php?postid=4668
+-------------------------------------------------------------------
 
B

Bryan Hessey

Hi,

A quick play with your file, opened in MS Word, replaced " " (3
spaces) by tab, saveas DOS text. In Excel, Data, import external, in
Wizard take Tab and colon : as delimiters, treat consecutive delimiters
as one, into A1

That gave A to L in the attached.

There were two spaces, not 1, in front of your data.

Set the formula in H2 as

=IF(LEFT(B2,2)=" ",MID(B2,3,999),IF(LEFT(B2,1)="
",MID(B2,2,999),IF(B2<>"",B2,"")))

copied across, and bulk copied down.

That gave columns H to L

With that range still selected, did Copy and Paste Special Values into
cell N2

That gave columns N O P and Q as your partially cleaned data.

Columns A to M can be deleted, they were just to show.

Does this help?

will try the new formula n that

--
I want to thank you for all that formula coding that went into that
attachment sheet.

This raw attachment is only a sample of the records. The records seem
more regular than irregular.

I tried what you had suggested and it would have worked except there is
a blank space in front of all my characters. Here is the raw data file.
Can you teach me what the formulas in order to extract data from these
records just like you did before ?

Thank you !


+-------------------------------------------------------------------+
|Filename: Spc.zip |
|Download: http://www.excelforum.com/attachment.php?postid=4669 |
+-------------------------------------------------------------------+
 
B

Bryan Hessey

Sample file after following the F2 formula, copy-paste special values t
a new sheet, delete columns A to E, sort = header row, column
ascending, remove all non-required rows.

As previously stated the bottom portion of your sets is a littl
non-fixed format and some manual intervention seems required, but i
may be more meaningful to you.

Hope this helps

--

Bryan said:
Hi,

A quick play with your file, opened in MS Word, replaced " " (
spaces) by tab, saveas DOS text. In Excel, Data, import external, i
Wizard take Tab and colon : as delimiters, treat consecutive delimiter
as one, into A1

That gave A to L in the attached.

There were two spaces, not 1, in front of your data.

Set the formula in H2 as

=IF(LEFT(B2,2)=" ",MID(B2,3,999),IF(LEFT(B2,1)=
",MID(B2,2,999),IF(B2<>"",B2,"")))

copied across, and bulk copied down.

That gave columns H to L

With that range still selected, did Copy and Paste Special Values int
cell N2

That gave columns N O P and Q as your partially cleaned data.

Columns A to M can be deleted, they were just to show.

Does this help?

-AMENDED POST

The formula to post in F2 is

=IF(LEFT($A2,12)="ITE
DETAILS",OFFSET($A$2,(INT(ROW()-2/4)-1)+INT((COLUMN()-6)/4),(MOD((COLUMN()-6),4)),1,1),"")

then formula copy that to ED - then, whilst still highlighted
bulk-formula copy to end of data

note, ED is one column too many, but where ED does not contain ITE
DESCRIP then an error has occurred and manual adjustment is required
sometimes by deleteing a row after joining data bits, sometimes b
inserting a row.


-

+-------------------------------------------------------------------
|Filename: Spc4.zip
|Download: http://www.excelforum.com/attachment.php?postid=4671
+-------------------------------------------------------------------
 
P

PY & Associates

Hi Bryan

I studied your solution carefully, well done.

I was considering an alternative approach and wish to share with you. My
proposal is:

use text to column, delimited with "space" and ":" on the whole file;

Heading row
cells(1,1)="Item details"
cells(1,2)="Model nr"
cells(1,3)="Code nr"
etc

for data row=2 to last data row
now Find row number containing word "Item" (as rownr)
then refer to relevant data using rownr + "nr of rows down" and colnr (which
can be counted easily)
cells(2,1)=cells(rownr,3)
cells(2,2)=cells(rownr,6)
cells(2,3)=cells(rownr+1, 3)
etc
next row

for description field which has been broken into multiple cells, we can
concatenate them back

"Bryan Hessey" <[email protected]>
wrote in message
 
B

Bryan Hessey

PY,

Thank you for your response.

Whichever option the OP chooses I think there will always be a need to
manually adjust some data. In the trial data I noticed two occasions
where the data had split over two lines, and with no real identifier to
detect these splits (ie, the supposed next line may or may not be
present) I see no easy way to automate all parts of that concatenation,
plus, not all fields seem present on all items. However, with a mere
1,000 or so records it would be more difficult to code than to fix
manually. Where 98% of the file is correct the OP could easily do these
fixes, The data is meaningful to the OP, more easily read and problems
detected, I was trying to produce a worksheet where the OP could see
those changes take effect as they were made rather than setting another
rule and re-importing the data.

Having said that I have no objections whatsoever to other suggestions
being made, after all it's the 'hive-mind' coupled with a few experts
that makes a forum a successful place, so please, go ahead and make any
suggestion.

Whatever gets the task completed for the OP is really the only
consideration, and that may be either by a more clever solution, or by
a more simple solution that can be understood, adapted, and used by the
OP.

Bryan

--
 
G

Guest

Interesting use of the Offset

Bryan Hessey said:
PY,

Thank you for your response.

Whichever option the OP chooses I think there will always be a need to
manually adjust some data. In the trial data I noticed two occasions
where the data had split over two lines, and with no real identifier to
detect these splits (ie, the supposed next line may or may not be
present) I see no easy way to automate all parts of that concatenation,
plus, not all fields seem present on all items. However, with a mere
1,000 or so records it would be more difficult to code than to fix
manually. Where 98% of the file is correct the OP could easily do these
fixes, The data is meaningful to the OP, more easily read and problems
detected, I was trying to produce a worksheet where the OP could see
those changes take effect as they were made rather than setting another
rule and re-importing the data.

Having said that I have no objections whatsoever to other suggestions
being made, after all it's the 'hive-mind' coupled with a few experts
that makes a forum a successful place, so please, go ahead and make any
suggestion.

Whatever gets the task completed for the OP is really the only
consideration, and that may be either by a more clever solution, or by
a more simple solution that can be understood, adapted, and used by the
OP.

Bryan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top