Split 2 GB file

gary

When I try to open a 2 GB file, I get a "Disk Full" error.

How can I split the original file into smaller files without first
opening the original file?
 
gary said:
When I try to open a 2 GB file, I get a "Disk Full" error.

How can I split the original file into smaller files without first
opening the original file?


I don't really know what you are trying to do, but yes, it is possible to
split a file into multiple parts to send over the internet or store on CDs.

But splitting the file into individual smaller parts from within a program
like Word or Excel or even a graphics program isn't possible without first
opening the original file in that program.

Now, if you let us know what program you are trying to use to "open" this 2
GB file in, we may be able to help with a workaround. Also check the amount
of free space you have on your system drive (normally C:\); a "Disk Full"
error when opening a file often means the program has no room for its
temporary working copy. If this file is on a USB thumb (flash) drive, copy
it to the computer's hard drive first.
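
If you want to check the free space from a Command Prompt, one quick way
(the drive letter is just an example):

dir C:\

The last line of the listing reports how many bytes are free on that drive.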

If any of the above helps, let us know; if not, please provide the program
name.
 
gary said:
When I try to open a 2 GB file, I get a "Disk Full" error.

How can I split the original file into smaller files without first
opening the original file?

Some suggestions here for alternatives to TextPad:

http://kb.mozillazine.org/Edit_large_mbox_files

( No file size limit is listed here...

http://www.textpad.com/products/textpad/screenshots/index.html )

*******

There are plenty of other tools around, such as the scripting languages
"awk" or "perl", that can also be used for this kind of job. But with
some luck, the tools in the Mozillazine article will allow you
to have a GUI while you work.
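
For example, here is a one-line awk sketch that writes each run of 1000
lines to a new numbered file. The chunk size and the "smaller_" names are
just for illustration:

awk 'NR % 1000 == 1 { if (out) close(out); out = "smaller_" ++n ".txt" } { print > out }' bigfile.txt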

In Unix or Linux environments, "head" or "tail" can be used for
chopping up files, such as

head -n 1000 bigfile.txt > smaller_1.txt

head -n 2000 bigfile.txt | tail -n 1000 > smaller_2.txt

The advantage of commands like that is that they're "line oriented",
so they break your file on line boundaries rather than in mid-line.

The first example command grabs the first 1000 lines.

The second example command extracts the second 1000 lines, using two
commands piped together. The ">" redirects output to the named new
file.
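
If you have a Unix-style shell available, the same pattern can be wrapped
in a loop to chop up the whole file. This is just a sketch; the 1000-line
chunk size and the "smaller_" filenames are assumptions:

lines=$(wc -l < bigfile.txt)   # total number of lines
chunk=1000                     # lines per output file
i=1
start=1
while [ "$start" -le "$lines" ]; do
    # lines start..(i*chunk) of bigfile.txt go into smaller_$i.txt
    head -n $((i * chunk)) bigfile.txt | tail -n +"$start" > "smaller_$i.txt"
    start=$((start + chunk))
    i=$((i + 1))
done

The "tail -n +N" form prints from line N to the end of its input, which
keeps the last (partial) chunk from picking up lines that already went
into the previous file.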

If the file, as imported into the environment, appears to be
"one long line", then those programs won't work quite right.
You need to do line termination character conversion in that case
( <cr> to <cr><lf> or vice versa ), which I'll ignore for now,
as there is always "dd" to finish this task :-)
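
If you do want the conversion rather than falling back on dd, "tr" (from
the same coreutils family) is one way to do it. A sketch, assuming the
file uses old Mac-style <cr>-only line endings:

tr '\r' '\n' < bigfile.txt > bigfile_unix.txt

After that, head and tail will see proper line endings.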

http://linux.die.net/man/1/head

http://linux.die.net/man/1/tail

You can even get ports of programs like these, so they'll run under Windows.
In my software collection, I can see them in the "coreutils" package. An
example of that is in an older post I made. Some of these ported packages
have "installers", but when I'm in a rush, I just copy the necessary files
into the current working directory and get the job done.

http://groups.google.ca/group/microsoft.public.windowsxp.general/msg/d63ee15fbd9ff9fc?dmode=source

These two links give the component parts, and they still seem to be
available, even if the original main page for coreutils is gone. You should
be able to find "head" and "tail" in there, plus whatever DLLs they need.
You would use them in a Command Prompt window, where you'd "change
directory" (cd) to the appropriate working directory to do your work. Put a
copy of head, tail, and the DLLs into the current working directory with
the "big file", then issue the necessary commands, then wait ...

http://iweb.dl.sourceforge.net/sourceforge/gnuwin32/coreutils-5.3.0-bin.zip
http://iweb.dl.sourceforge.net/sourceforge/gnuwin32/coreutils-5.3.0-dep.zip
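
A sketch of what that might look like in the Command Prompt window, once
head.exe, tail.exe, and the DLLs are sitting next to the big file (the
C:\work folder name is just an example):

C:\> cd C:\work
C:\work> head -n 1000 bigfile.txt > smaller_1.txt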

The command syntax in Windows should be exactly the same for those ports.
And as long as the line termination characters allow the tools to recognize
the end of each line, it'll work. Using a series of commands like this,
it'll take a dog's age to chop up the file, but you'll eventually get it
done.

head -n 2000 bigfile.txt | tail -n 1000 > smaller_2.txt

*******

Now, if you're the kind of person "more comfortable with a chain saw", this
tool gets the same job done more rapidly. It splits on byte counts rather
than line boundaries, so the last line in each file could be cut in half,
for example.

http://www.chrysocome.net/dd

You work in a Command Prompt window. Drop a copy of dd.exe into your
working directory.

dd if=bigfile.txt of=smaller_1.txt bs=1048576 count=1024

dd if=bigfile.txt of=smaller_2.txt bs=1048576 count=1024 skip=1024

That copies the first (1024 x 1048576) characters, i.e. the first
gigabyte, into smaller_1.txt . The second command copies the second
(1024 x 1048576) characters into smaller_2.txt, and so on. The "skip"
option skips you X blocks along before starting the copy. The "bs" or
"Block Size" field is a multiple of 512 bytes, which is a sector on older
hard drives. The command really likes binary numbers, at least for the
block size field.

For the last piece, I suspect something like this would work. With no
"count" given, the command copies until it hits the end of file on input,
so this would copy everything from 2 GB up to the end of bigfile.txt,
making a third file.

dd if=bigfile.txt of=smaller_3.txt bs=1048576 skip=2048

Commands like that run anywhere from 13MB/sec to 60MB/sec,
depending on conditions.
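
If you want to sanity-check the result, the Command Prompt can glue the
pieces back together and compare against the original. A sketch, using
the filenames from above (this needs enough free disk space for the
extra copy):

copy /b smaller_1.txt + smaller_2.txt + smaller_3.txt rebuilt.txt
fc /b bigfile.txt rebuilt.txt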

HTH,
Paul
 