Why Can't I Import a Text File that is 1.6Gig

G

Guest

I structured a table with 56 fields. The total characters for these 56 fields
is 566. There are 2,923,797 records in the text file. When I attempt to
import into the existing table, I get an *Invalid Argument* error message.
The *.mdb filesize (using MS-Explorer) becomes 2,095,108 KB. After I compact
& repair the db it returns to 256 KB.
Apparently MS-Access requires additional 'overhead' which is also eating up
space; a colleague loaded a 1.3 gig file successfully, leaving the *.mdb file
just over 2 gig in size -- when he compacted & repaired the db it compressed
down to 1.3 gig. Am I to assume that MS-Access requires >50% of an original
file's size as overhead in the process of importing?
Any insight is appreciated.
Thanks, Mark
 
J

John Nurick

Hi Mark,

The details are complicated and I doubt whether anyone outside Microsoft
really understands them.

One thing to do is to remove some or all indexes from the table before
you import the data and re-create them afterwards. Updating the indexes
during the import increases the overhead.

Dividing a big file into chunks and importing them one at a time helps
too: there seems to be less overhead involved at any one time. There's a
splitter utility in the Gnu text utilities, of which Windows versions
can be downloaded from http://unxutils.sourceforge.net/. Other Gnu
utilities let you cut out any fields you don't need.

Also, the field types you're using make a difference. Longs take up more
space than integers, for instance.

Simple multiplication (566 * 2,923,797) suggests that your file is about
1.5GB. This may be too big to import at all. But IME most huge text
files contain a lot of redundant data, and it may be possible to
normalise it into two or more smaller files (again using the Gnu
utilities or other text file tools such as Perl: all these are designed
to work with humungous files).
 
J

John Marshall, MVP

Been done that road before. The problem is that the Import command ingests
the entire text file before it processes the file. You need to use the old
fashion method of reading the text file line by line.

This is an example of one of my programs to load a text file into access.

Public Sub CnvAmdMgr()

Dim AmdNum As String, AmdType As String, db As Database, rst As Recordset
Dim i As Integer, InputData As String, MgrAdd1 As String, MgrAdd2 As String
Dim MgrCntry As String, MgrCoGovInd As String, MgrName As String
Dim MgrOwnerStatus As String, MgrPC As String, MgrProv As String
Dim OffNum As String, RecCnt, RecPtr As Integer
Dim KeyName As String
Dim StartTime As Date
StartTime = Now()

ProgName = "AmdMgr"

Debug.Print "Amd Mgr Start : "; Time;

CurrentDb.Execute "DELETE * FROM AmdMgr", dbFailOnError

Set db = CurrentDb()
Set rst = db.OpenRecordset("AmdMgr", dbOpenDynaset)

Open "c:\convert\CnvAmend06.txt" For Input As #1

RecCnt = 0

Do While Not EOF(1)

Line Input #1, InputData

RecPtr = 1
i = 7: OffNum = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 4: AmdNum = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 2: AmdType = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 60: MgrName = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 40: MgrAdd1 = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 40: MgrAdd2 = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 6: MgrPC = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 3: MgrProv = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 3: MgrCntry = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 3: MgrCoGovInd = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i
i = 3: MgrOwnerStatus = Mid(InputData, RecPtr, i): RecPtr = RecPtr + i

OfficialNo = Val(OffNum): AmendNo = Val(AmdNum)

KeyName = CleanAddress(MgrName, MgrAdd1, MgrAdd2, MgrPC, MgrProv,
MgrCntry, MgrCoGovInd)

With rst
.AddNew
!OffNum = Val(OffNum)
!AmdNum = Val(AmdNum)
!AmdType = Val(AmdType)
!MgrName = MixedCase(MgrName)
!MgrAddress = MgrAdd1
!MgrPC = MgrPC
!MgrProv = Val(MgrProv)
!MgrCntry = Val(MgrCntry)
!MgrCoGovInd = Val(MgrCoGovInd)
!MgrOwnerStatus = Val(MgrOwnerStatus)
!AddKey = KeyName
.Update
End With

RecCnt = RecCnt + 1: If RecCnt = Int(RecCnt / 20000) * 20000 Then
DoEvents
Loop

Close #1

rst.Close
Set rst = Nothing

Debug.Print Tab(40); "Elapsed : "; Format((Now() - StartTime), "nn:ss"); "
"; Format(RecCnt, "#,###,##0")

End Sub



John... Visio MVP

Need stencils or ideas? http://www.mvps.org/visio/3rdparty.htm
Need VBA examples? http://www.mvps.org/visio/VBA.htm
Common Visio Questions http://www.mvps.org/visio/common_questions.htm
Visio Wishlist http://www.mvps.org/visio/wish_list.htm
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top