How to format imported data

G

Guest

I am importing a large MS-DOS address list file (.ls) into Excel. There are
over 18,000 entries and instead of draggin/dropping the data 1 by 1 into the
position I want, is there an easier way to do it? The data looks like this
(comes in as Cheshire label format....4 cols wide) and each name/add combo
takes up 4 lines in 1 column:

MR JOHN SMITH MS JANE DOE
APT 1
123 A ST 123 B ST
ANYTOWN US 00000 ANYTOWN US 00000

Our catalog people require the format to be separate columns: name, ad1,
ad2, c,s,z, and each entry should be a row. The entries have no column
headings, and are not separated by commas or anything to make a "text to
columns" easier either. Any help or suggestings please!
 
D

Dave O

Hi, Erika-
I started looking at this last night and bogged down, so I thought I'd
look again this morning. You've already imported this into Excel,
correct? If that's the case, can you safely make broad assumptions
such as: each section is no more than 4 lines, each section is
separated from the next section by only one blank line, the town name
is or is not always separated from its state by a comma, the zip code
is or is not always 5 numbers and not a zip + 4, the town, state, and
zip are always separated by two blank spaces (as in your example)?
These clues will help the conversion process.

And something else struck me: is this data from a legacy system that
was actively in use recently, or is it a file somebody found that dates
back to 1995 (when Windows 95 appeared)? I'm wondering if the data may
be too outdated to fuss with.

Dave O
 
G

Guest

Hi Dave,

Thank you so much for taking a look at this! I am far from an Excel wizard,
so any assistance I can get would be helpful. Yes, I have already imported
the data. When I import, the dialog box comes up where I can choose
delimited or fixed width, so FW is the one I end up choosing since the only
affecting delimiter is spaces (if that makes sense...if I choose comma,
semicolon, or even tab, nothing happens). Broad assumptions...each section
exactly 4 lines. If someone doesn't have an apartment or alternate address
to fill the space, that line is left blank. Each entry is separated by 2
blank rows (but no blank columns...once it gets down to row 28024, it'll
start a new column and there are 4 columns of equal length). The width of
the colums would be 27 max characters I think, although I set them as 29
chars in the fixed width section of data import just to be sure not to cut
any data out. There are no commas at all to separate fields, and zip is only
a 5 digit zip code. The spacing between state and zip is only 1 space.

As for this program, it is a legacy program. It was created specifically
for us back in the 70s or 80s when computers were first introduced into our
company and we still use it to this day. We just generated this list of
customers about 2 weeks ago. My boss ran a query to pull customers who were
active between 1995 and 2000, and I'm working with that data now.

If this info helps, the office computers run on Win98 and this program has
been compatible so far. And if I understood correctly, I was told that a
newer OS would not compatible with this DOS program because of FAT-32? I'm
not exactly sure, as I'm not an IT specialist. I'm using XP here on my
personal computer, and I can still import the address list we just generated
into Excel.

Let me know if you want me to email you the data so you can see what I'm
talking about...I can email the original .ls file. I'm wondering if maybe I
need a programmer in here to write a little program that will still pull the
data I need, but can arrange it in the way our catalog printer is requiring.
If you have any suggestions, I look forwards to hearing! And let me know if
you need more info from me...not sure if I'm explaining well or not.
 
D

Dave O

Hi, Erika-
Rather than send the .LS file, do you mind sending me the first hundred
lines or so of your Excel file? That should be enough to get the feel
for the data. Or, send mocked up data: your boss may get heartburn if
you send proprietary data to someone on the internet. Please send the
excel file to cyclezen ATSIGN yahoo DOT com (my blind email account).

I don't think it'll be too much of a chore: the worst will probably be
parsing the C, S, Z into individual fields.

Dave O
 
D

Dave O

Hi, Erika-
This code worked on the data I mocked up. It is not at all fault
tolerant: the code assumes there is always a zip code, always a state,
etc. I can include data validation and exceptions report in this code
if the number of errors you see is too great, but I figured this would
be a starting point.

Copy this code into a code module, and run it- it takes about 25
seconds to process (on my machine). Note you'll need to delete column
I before you send it on. Let me know what you think.


Sub LS_Parse()
'screen flicker off
Application.ScreenUpdating = False

'autocalc off
With Application
.Calculation = xlManual
End With

Dim LastRow As Long
Dim K As Byte
Dim Z As Byte

'format zip code column as text to preserve leading zero
Columns("L:L").Select
Selection.NumberFormat = "@"

'determine last row
Range("a1").Select
LastRow = ActiveCell.SpecialCells(xlLastCell).Row

'start of main loop
Do While ActiveCell.Row <= LastRow
'populate array
ReDim arrdata(1 To 7, 1 To 7) As String
For K = 1 To 4
For Z = 1 To 4
arrdata(K, Z) = ActiveCell.Offset(K - 1, Z - 1).Value
Next Z
Next K

'C S Z
For Z = 1 To 4
arrdata(7, Z) = Right(arrdata(4, Z), 5)
arrdata(6, Z) = Mid(arrdata(4, Z), Len(arrdata(4, Z)) - 7, 2)
arrdata(5, Z) = Mid(arrdata(4, Z), 1, Len(arrdata(4, Z)) - 8)
Next Z

'write to outcells
For Z = 1 To 7
For K = 1 To 7
ActiveCell.Offset(K - 1, 4 + Z).Value = arrdata(Z, K)
Next K
Next Z

ActiveCell.Offset(6, 0).Select

Loop
Application.ScreenUpdating = True

With Application
.Calculation = xlAutomatic
End With

Calculate
MsgBox "Done."
End Sub
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top