Inserting a single character into the middle of a 3 GB file

R

R Avery

I wrote a program that analyzes a 1.5 GB text (CSV) file and outputs a
3.0 GB text file. However, it turns out that on the 2.3 something
millionth row, it is missing a comma. Therefore, when trying to import
the flat file into SQL Server, it is complaining about data type error.

What i would like to do is to write a VBA program that will just insert
a comma into the appropriate place in the file (I can figure out the
offset that i would like to place it at). If this is not possible
because of the way data is stored on disk, it is also possible for me to
remove a character from elsewhere in the line to make up the
difference. Therefore, what i want to do amounts to changing

20040315,34.349,20020401-45.69,13.34897273,100 into --->
20040315,34.349,20020401,-45.69,13.3489727,100

However, I do not know how to do this. Any help would be appreciated.
 
G

Guest

Sub AddComma()
'Assumes the data is in contiguous rows and stops when it hits a blank row
'adds a comma just before the "-" sign and puts the results in column "B"

Do Until ActiveCell.Value = ""
ActiveCell.Offset(0, 1).Range("A1").Select
ActiveCell.FormulaR1C1 = _

"=LEFT(RC[-1],FIND(""-"",RC[-1],1)-1)&"",""&RIGHT(RC[-1],LEN(RC[-1])-(FIND(""-"",RC[-1],1)-1))"
ActiveCell.Offset(1, -1).Range("A1").Select
Loop
End Sub

Not certain this is what you want, but I hope it helps
 
F

Fredrik Wahlgren

R Avery said:
I wrote a program that analyzes a 1.5 GB text (CSV) file and outputs a
3.0 GB text file. However, it turns out that on the 2.3 something
millionth row, it is missing a comma. Therefore, when trying to import
the flat file into SQL Server, it is complaining about data type error.

What i would like to do is to write a VBA program that will just insert
a comma into the appropriate place in the file (I can figure out the
offset that i would like to place it at). If this is not possible
because of the way data is stored on disk, it is also possible for me to
remove a character from elsewhere in the line to make up the
difference. Therefore, what i want to do amounts to changing

20040315,34.349,20020401-45.69,13.34897273,100 into --->
20040315,34.349,20020401,-45.69,13.3489727,100

However, I do not know how to do this. Any help would be appreciated.

Please explain *exactly* how you can figure out where the comma is missing.
Is it always before a hyphen? How is the CSV file generated? It is better to
fix what's causing the problem in the first case, don't you think? Right
now, it seems as if you are trying to fix a symptom rather than the actual
problem.

/ Fredrik
 
T

Tim Williams

Open your file for reading, open a new file for writing. Read
consecutive lines from the original file and write them to the new
file. When you get to the problem line, fix it. Write the rest of
lines from the original file.

Should not be a problem assuming you have the spare 3GB....

Tim.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top