Code Efficiency Suggestions

J

Job

I'm sorry Nick, it's been a long day/night.

TransDesc - - Text - Refers to the First 'column'. Is dependent upon the
location of [DocNum] in the first columns string. I find the value of
DocNum + the len(DocNum) and grab everything to the right of the DocNum
which is the TransDesc.

Thanks again for all your help!
 
N

Nick Hebb

This is a really long reply, so I'll break it into several parts in
case there's a length restriction for these posts.

Part 1:

Now that I've looked at it, I definitely wouldn't do any of this in
Excel. It could be done pretty easily in Access. [I know it's

frustrating when people say "easy", but hopefully what follows will be
"easy" to follow]. Here's what I would do:

1. In Access, create and save a table called tmpImportData and give it
5 text fields, name them Col1 - Col5.

2. From the file menu, select Get External Data > Import. An import
dialog box will open (essentially it's the same as a file open dialog

box). Change the file types to text, locate your raw data file, select
it, and click Import.

3. The Import text wizard will pop up. Click Advanced and set the field
names to Col1 - Col5, and click the skip checkbox second field since

you don't seem to use it. Then click the Save As button and give it a
Specification Name.

4. Finish the wizard, telling it to add the data to an existing table -
select tmpInputData.

5. Next, create a query in design view. I'll call it
qryCleanInputData. switch to SQL view and paste the following:

DELETE Left([tmpInputData]![Col1],1)=Chr(12) Or Not
IsDate(Left([tmpInputData]![Col1],10)) AS InvalidDate
FROM tmpInputData
WHERE (((Left([tmpInputData]![Col1],1)=Chr(12) Or Not
IsDate(Left([tmpInputData]![Col1],10)))=True));

{Note the change from char(12) to chr(12)}

6. Run the query to clean your data.
 
N

Nick Hebb

part II:


The following 2 steps could be done as a single step, but to make it
simple we'll split the tasks.

7. Create another query in design view. At the Show Table prompt
select tmpInputdata and click the add button. This query is where all
the

hard work comes. Save the query as qryMakeParsedInputDataTable.
Looking at all the formulas you had listed in your original post, at
least

half of the formula content was conditional logic based on invalid data
that we just stripped out in qryCleanInputData. Here's a breakdown

of what you need to do (you may want to do a lot of this off-line in a
text editor):

a. Replace the target ranges with the name you want for each column
followed by a ":", as the following example shows:

'Range("A" & i).FormulaR1C1 =
"=IF(RC[4]<>"""",TEXT(RC[4]*1,""yyyymm""),"""")"

Becomes

StatementDte:=IF(RC[4]<>"""",TEXT(RC[4]*1,""yyyymm""),"""")

b. Remove all the conditional logic in your formulas that depended upon
whether TransDte was invalid, since the query we created cleans that

up. So TransDte becomes:

TransDte:=TRIM(LEFT(TRIM(RC[10]),10))

c. Replace the references to cells with Col1 - Col5 references. The
references are in the format TABLE_NAME![FIELD_NAME], so TransDte now

becomes

TransDte:=TRIM(LEFT(TRIM(tmpInputData![Col1]),10)

d. Replace any other cell references that refer to newly created
columns to their new field names. So COAS becomes
COAS:=IF(LEFT(TRIM(tmpInputData![Col1]),5)="COAS:",TRIM(MID(tmpInputData![Col1],6,4)),[StatementDte])

{note the new reference to StatementDte}

e. Replace all your Excel functions with Access equivalents. The only
2 I see are:
- TEXT([TransDte], "yyymm") becomes Format$([TransDte], "yyyymm")
{Note, I've changed the Cell reference to the column name that will
be created in this query}
- IF(condition, true, false) becomes Iif(condition, true, false) {note
the extra "i" in "Iif"}.

f. Replace any remaining """"'s with "" if there are any.

g. At this point you should have a formula that matches every column in
the table. Take each formula and paste it into a separate field in

the query. In the query editor, select query from the menu and select
make table Query. Give the new table the name tmpParsedInputData.

h. Save the query and close it. Then run it by double-clicking on it.

i. Open the resulting table and check the contents. If it doesn't look
right, go back and check your formulas in the query.
 
N

Nick Hebb

Part II:


8. Create another query and select tmpParsedInputData in the Show table
dialog. Select all the fields from tmpParsedInputData and drag them

into the fields area. Go to Query on the menu and change the type to
Append query and select your target table as the table to append to.

Access should automatically figure out which fields append append
where, but you'll want to double check. Save the query as

qryAppendParsedInputData.

9. At this point you're ready to run the append query, but you should
be extra cautious. It's really easy to screw up data in a flash with

databases, so you should test everything first. I would make a copy of
the original target table and change the target table in the append

query to the copy. Run the append table and check the effect on the
data. Then I would import the data into the original table using your

traditional method. Then, create a find unmatched query using the
query wizard and verify that all the records created with the new
method

match the records created with the old method. If it all works, change
the target table in the append query back to the original target

table (instead of the copy).

10. Create a new query and save it as qryDeleteInputData. In SQL view
paste the folowing:

DELETE tmpInputData.Col1
FROM tmpInputData
WHERE (((tmpInputData.Col1) Like "*"));

11. Go to the macro screen and create a new macro. Add the following
Actions
- OpenQuery (query name = qryDeleteInputData)
- DeleteObject (type = table, name = tmpParsedInputData)
- TransferText (specification name = to the one you saved earlier, file
name = your source file, table name = tmpInputData)
- OpenQuery (query name = qryCleanInputData)
- OpenQuery (query name = qryMakeParsedInputDataTable)
- OpenQuery (query name = qryAppendParsedInputData)
- DeleteObject (type = table, name = tmpParsedInputData) ' run again to
clean up
- OpenQuery (query name = qryDeleteInputData) ' run again to clean up

You might also want to Repair and Compact the database since theiur was
a lot of data created and deleted:
- RunCommand (command = RepairDatabase)
- RunCommand (command = CompactDatabase)

Save the macro and you're set.
 
N

Nick Hebb

Note, the posts were submitted in order, but show up out of order.
also, the "Part II" that starts with step 8 should be Part III.
 
J

Job

Nick Thanks for the Code! I have a question, when determining the 'Fund' in
Excel I have to use an offset as it is two lines below and it doesn't tell
the fund on the actual line of the fund number. When we clean the
irrelevant lines of code, we now no longer know where the fund is..for
example in the clean column you'll see something like "This is the fund
description ZO09876" the "This is the fund description" can be anything so
it would be difficult to key off of any type of text there. If you just set
the column to extract the last word in that column, you'd also get all kinds
of other data as that column contains most of the relevant data. Any
thoughts?
 
N

Nick Hebb

Create another make table query, and go into SQL View and paste the
following:

SELECT tmpInputData.ID, tmpInputData.Col1, tmpInputData.Col2,
tmpInputData.Col3, tmpInputData.Col4, tmpInputData.Col5,
tmpInputData_1.Col1 AS Fund
FROM tmpInputData LEFT JOIN tmpInputData AS tmpInputData_1 on
tmpInputData.ID = tmpInputData_1.ID +2

This query grabs Col1 from the row 2 rows down. Because the syntax
isn't Query Designer friendly, you won't be able to create or view the
design in Design view and will get a message box telling you so.

Add an OpenQuery step to your macro - before the qryCleanInputData.
You will also need to base all your other queries off of the table
created here (whatever you decide to name it). Also, I would create
DeleteObject macro steps to remove this table after your operation is
complete.
 
J

Job

Nick,

I failed to post my thanks for all your help! It helped tremenously and
opened my eyes to new possibilities regarding text streaming.

Thanks again for all the work!

Cheers,

Job
 
N

Nick Hebb

No problem. Sometimes I have this skewed sensibility that makes it more
fun to work on other people's problems than my own!

By the way, did you get this to work? If so, how much did you cut the
import time by? Just curious.

--Nick
 
G

Guest

Job,

I will preface this post with the admission that I have not read all 31
posts in their entirety, so pardon if I repeat some advice that you’ve
already received.

I have a similar application in which I access several text files and import
data into Excel. I initially started by importing the entire file and then
deleting the lines I didn’t need. My suggestion is similar to Nick’s
suggestion about ‘scrubbing’ the data and dumping the needed data into a new
text file. You can do the same but put it in Excel.

In the code below I read a file and only import the lines that I want and if
I reach 65400 (I leave a little padding for another reason) I move the active
cell to the first row of the next column. Since all the data I am importing
is never seen by the user (data within each line is extracted) I am not
concerned with the format of the cells. This cuts down on the number of tabs
I create as well as the process time.

One file was 500,000+ lines. Reading each line in and then deleting
unneeded lines took upwards of 30 minutes. When I only import the lines I
need the same file took 20 seconds to process. Now I have an application in
which I must read in several different text files each with several thousand
lines and the 20 seconds each seems like an eternity….I guess it just goes to
show you, faster is never fast enough.

Here’s the code: (I have left out variable declaration and the like to
conserve space)

Sub GetData()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Application.StatusBar = "Retrieving file"
Filename = Application.GetOpenFilename("Text Files (*.txt), *.txt")
If Filename = "False" Then End
Application.Cursor = xlWait
filenum = FreeFile()
Open Filename For Input As #filenum
Sheet1.Cells(1, 1).Select
Do While Seek(filenum) <= LOF(filenum)
Line Input #filenum, ResultStr
readline = LTrim(ResultStr)
If InStr(1, readline, "TEXT_TO_FIND-1") > 0 Then
importline = True
ElseIf InStr(1, readline, "TEXT_TO_FIND-2") > 0 Then
importline = True
Else
importline = False
End If
If importline Then
Application.StatusBar = "Importing Row " & counter & " of " & Filename
ActiveCell.Value = readline
If ActiveCell.Row = 65400 Then
ColCounter = ColCounter + 1
Sheet1.Cells(1, ColCounter).Select
Else
ActiveCell.Offset(1, 0).Select
End If
Else
Application.StatusBar = "Reading Row " & counter & " of " & Filename
End If
importline = False
counter = counter + 1
Loop
Close
Application.Cursor = xlDefault
Application.StatusBar = False
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub


The if conditions can change to suit the data you are looking to import. If
it is a certain string that you are looking for, just replace
"TEXT_TO_FIND-1" with your string. Hope this is of some help.

Mike
 
J

Job

Hi Nick,

I did get it to work. It cut the time down to about 4 minutes. This is a
huge improvement over the orignal 30+ minutes.

Cheers!
 
J

Job

Hi Mike,

Thanks for the post. You hit it straight on. This is similar to what I'm
using other that the fact that the if's are very complex and I'm putting it
into Access instead of Excel.

Cheers!
Job
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top