XL2003: VBE Won't let me break into running code

C

CTB

Hello all,

I'm trying to manipulate huge text files (1 - 4 GB each) with the
TextStream object. Each file contains 6 months of data...I'm trying
to create 6 smaller files with one month of data each. The first step
in the process is reading through the whole file one line at a time so
I can get a total number of lines (any advice to do this faster/
alternate way would be greatly appreciated...I looked, didn't see a
"line count" property of the TextStream or File object so I could
avoid reading whole file). This part of the first file (smallest
file) took 5 - 10 mins to run. I use the total line count for
progress reporting in Excel's progress bar for the next
part...building the 6 monthly files from the main file.

At some point when the code is running (I don't know what point it
is...might change each time code is run), the VBE will no longer let
me break into running code ([Ctrl] + [Pause/Break]) for debuging...it
just stops running code...it no longer offers me the option to stop or
debug.

The time before last, I had a breakpoint set just after getting total
line count and a "Break when value is true" watch set just a few lines
before finishing the first monthly file (roughly 1/6th of the way
through the 2nd full read through of the main file) so I could debug
the code closing the first monthly file, creating the 2nd monthly
file, then continuing to write data. Breakpoint broke into code
without problems (VBE not yet tripped up), but "Break..." watch was
ignored.

This last time running code, it made it through the Total Line Count
run through, but threw an error (Data Type mismatch...if I remember
correctly). Instead of giving me the option to stop or debug...it
just stopped the code...I really need to be able to debug this.

XL/VBE goes into "Not Responding" mode (but code is still running)
usually when Windows Focus changes to another program. Maybe that is
when the VBE trips up and won't let a break-in.

Anyone else have issues with this? Is there anything that can be done
to fix this (other than staring at XL the whole time code is running
hoping a reminder won't pop up to take focus away from XL...hoping XL
won't go into "Not Responding" mode)? Is XL 2007's VBE more stable
that it will be able to handle this better (I do have XL2007 installed
on this machine)?

Thanks for any help anyone can provide,

CTB
 
G

GS

I've never worked with text files as large as GBs but I know that MB
files process fairly quickly using arrays and normal VB I/O (as opposed
to FileSystemObject).

Essentially, I read the entire file into a Variant variable in one shot
and then use the Split() function to dump it into another Variant
variable, resulting in a dynamic array. If the file contains data
records AND the first line contains the data fieldnames, the UBound of
the array is the record count. If the first line doesn't contain
fieldnames then the line count is the UBound+1.

To parse the data into separate months would be no problem by looping
the array checking which month each element belongs to and dumping that
into a String variable. If you set up a String variable for each month
then you should only have to loop the array once, and use a Select Case
construct to build the separate month strings.

Once you've parsed the file into appropriate month strings you can
write them back to individual files in one shot (each).

Keep in mind that handling files of this size will burden resources
some, but if your machine is fit for such tasks it should be no problem
to do.

You can run a progress bar in the array loop, OR report the progress in
the StatusBar for each line as it gets processed.

Summary:
1. Read source file into array in one shot.
2. Parse data into separate month strings using one loop.
3. Write month strings in one shot to separate files.
 
J

Jim Cone

It sounds as if you are running out of memory.
Do not use "ReadAll". It will do that.

Use: "Application.EnableCancelKey = Interrupt" and use DoEvents inside of any loops.

Your posted code did not show up in my newsreader.
The following function will take < 10 minutes to return the number of lines in a text file...
"---
Function NumberOfLines(sFilePath As String) As Long
Dim fs As Object
Dim f As Object
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.OpenTextFile(sFilePath, 8) ' 8 -> Appending
NumberOfLines = f.Line - 1
f.Close
Set f = Nothing
Set fs = Nothing
End Function
'---
Jim Cone
Portland, Oregon USA .
http://www.mediafire.com/PrimitiveSoftware .
(free and commercial excel programs)






"CTB" <[email protected]>
wrote in message
Hello all,

I'm trying to manipulate huge text files (1 - 4 GB each) with the
TextStream object. Each file contains 6 months of data...I'm trying
to create 6 smaller files with one month of data each. The first step
in the process is reading through the whole file one line at a time so
I can get a total number of lines (any advice to do this faster/
alternate way would be greatly appreciated...I looked, didn't see a
"line count" property of the TextStream or File object so I could
avoid reading whole file). This part of the first file (smallest
file) took 5 - 10 mins to run. I use the total line count for
progress reporting in Excel's progress bar for the next
part...building the 6 monthly files from the main file.

At some point when the code is running (I don't know what point it
is...might change each time code is run), the VBE will no longer let
me break into running code ([Ctrl] + [Pause/Break]) for debuging...it
just stops running code...it no longer offers me the option to stop or
debug.

The time before last, I had a breakpoint set just after getting total
line count and a "Break when value is true" watch set just a few lines
before finishing the first monthly file (roughly 1/6th of the way
through the 2nd full read through of the main file) so I could debug
the code closing the first monthly file, creating the 2nd monthly
file, then continuing to write data. Breakpoint broke into code
without problems (VBE not yet tripped up), but "Break..." watch was
ignored.

This last time running code, it made it through the Total Line Count
run through, but threw an error (Data Type mismatch...if I remember
correctly). Instead of giving me the option to stop or debug...it
just stopped the code...I really need to be able to debug this.

XL/VBE goes into "Not Responding" mode (but code is still running)
usually when Windows Focus changes to another program. Maybe that is
when the VBE trips up and won't let a break-in.

Anyone else have issues with this? Is there anything that can be done
to fix this (other than staring at XL the whole time code is running
hoping a reminder won't pop up to take focus away from XL...hoping XL
won't go into "Not Responding" mode)? Is XL 2007's VBE more stable
that it will be able to handle this better (I do have XL2007 installed
on this machine)?

Thanks for any help anyone can provide,

CTB
 
G

GS

spilly39 expressed precisely :
I'd like to see the code you are using to read the lines
- shouldn't be more than a few lines of CODE (as opposed to DATA that is!)

I've handled large log files, tho' not in the GB range

It's vital you do not read the whole file, but just a block at a time and
within that only one logical line at a time. My issue was that some data
files were exUnix (Lf) and other data ex Windoze (CrLf line terminated) and
VB Line Input statement needs CRLF data.

My solution was to write a class (UnixDosReader) with properties of filename,
size etc, and methods of fOpen, GetNextLine and fClose

I'm looking at my code now (not having used it since I retired)
I see I handled the file unblocking in VBA myself if it was a Unix file,
reading one block of data (of whatever size I chose) at a time.

What I did was to scan the 1st lump of the data at file open time to
determine if it was LF or CRLF data
If CRLF in block 1, then my GetNextLine method reads the file one line at at
time using std VB IO code
Line Input #pFileNo, dataLine

If it's Unix it gets a bit more complicated, with a Private Sub inside the
class:
ReadaBlock(s as String, Size as Integer) 'size is size of block buffer
which internally did
Get #pFile,,s '(read into s String argument)
You have to gwt it right at the end of the file, which doesn't fill your
standard blocksize.

Altogether there's 273 lines of code (of which the 1st 94 are solid comments)

I could send the code - it's a bit long to post here

BUT I have one little nagging worry...
FileSize is typed as Long, which on my 32 bit machine only gets to 2GB. I
assume you're on a 64 bit box? Just checking..
You'll have to look carefully at any such Typing matters if you want to use
my code

spilly


Thanks for stepping in!
I ran into the same issues with reading larger files in all at once.
Like you I've never done GB-size files but have had to read files as
large as 10MB in blocks as you describe. For me, however, this was a
onetime workaround and so I never bothered to refine it beyond that.

Your solution sounds exactly what this OP needs. I'd also be very
interested if you're willing to share.
 
C

Clif McIrvin

spilly39 said:
[ ]

Hi Garry

That was quick!

I see no reason not to share.
In fact I'd be willing to post an attachment here if that's allowed,
but I
doubt it is.

I'm not a big NG man, so I'm unsure of the Netiquette;
I'm really only subscribed (today!) becuase of another prob I've got
but when I saw this thread, it was a dead cert for my UnixDosReader
class...
Oh sod it! I'll try an attachment anyway - it's got the usual chances:
Success or Failure!
Health warning:
I'm a totally self taught VBA afficionado.
I'm always impressed with my code when I write it
but never impressed when I read it again two years later

Boy, that song sure sounds familiar! <grin>

300 lines of code? Just copy / paste into your reply and we'll all get
it just fine!

(as you may know by now, the attachment didn't make it through the
eternal september servers)

Welcome to the NGs! [Good tutors in this classroom!]
 
G

GS

Clif McIrvin formulated on Wednesday :
spilly39 said:
[ ]

Hi Garry

That was quick!

I see no reason not to share.
In fact I'd be willing to post an attachment here if that's allowed, but I
doubt it is.

I'm not a big NG man, so I'm unsure of the Netiquette;
I'm really only subscribed (today!) becuase of another prob I've got
but when I saw this thread, it was a dead cert for my UnixDosReader
class...
Oh sod it! I'll try an attachment anyway - it's got the usual chances:
Success or Failure!
Health warning:
I'm a totally self taught VBA afficionado.
I'm always impressed with my code when I write it
but never impressed when I read it again two years later

Boy, that song sure sounds familiar! <grin>

300 lines of code? Just copy / paste into your reply and we'll all get it
just fine!

(as you may know by now, the attachment didn't make it through the eternal
september servers)

Welcome to the NGs! [Good tutors in this classroom!]

Actually, I did get the attachment. As you can see, I'm using MesNews
as my reader and so attachments are supported. It shows up as a link
("File: unixdosreader.cls") in my message window.
 
G

GS

Hi Graham,
Thanks so much! I got your attachment as an attachment in the message
window.

This looks well thought out. The detailed comments are a blessing,
though this will take some time to digest. Can't wait to do a test
drive...<bg>
 
C

Clif McIrvin

spilly39 said:
Oh! I thought it had. My reader shows the attachment fine

OK, Here's the code anyway (and I've snipped the other thread text)
Enjoy!

[ ]

My bad. I'm using OE; and I completely missed the "This message has an
attachment" flag.

When I went back and checked, sure enough ... I can get the original
attachment also.

Thanks much for sharing your work!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top