Find and Replace in Binary File

mouac01

Newbie here. How do I do a find and replace in a binary file? I need
to read in a binary file then replace a string "ABC" with another
string "XYZ" then write to a new file. Find string is the same length
as Replace string. Here's what I have so far. I spent many hours
googling for sample code but couldn't find much. Thanks...

public static void FindReplace(string OldFile, string NewFile)
{
    string sFind = "ABC";    // I probably need to convert these to a byte array
    string sReplace = "XYZ"; // but I don't know how.
    int i;
    FileStream fin = new FileStream(OldFile, FileMode.Open);
    FileStream fout = new FileStream(NewFile, FileMode.Create);
    do
    {
        i = fin.ReadByte();
        if (i != -1)
        {
            // I think I need to compare the byte being read in
            // to the 1st sFind byte array here. If it matches then
            // store the position and compare the next byte.
            // If all 3 bytes match then replace with sReplace byte array.
            // I'm just not sure how to do it.
            fout.WriteByte((byte)i);
        }
    } while (i != -1);
    fin.Close();
    fout.Close();
}
 
Peter Duniho

mouac01 said:
Newbie here. How do I do a find and replace in a binary file? I need
to read in a binary file then replace a string "ABC" with another
string "XYZ" then write to a new file. Find string is the same length
as Replace string. Here's what I have so far. I spent many hours
googling for sample code but couldn't find much. Thanks...

From your code sample, it looks like you want the Encoding class. This
will allow you to take the strings and convert them to byte arrays. Then
you can use the "find" byte array as you search through the binary file,
and of course using the "replace" byte array to supply the replacement
bytes once you've found something to replace.

Of course, the key will be to choose an Encoding instance that is
appropriate to the file. Which one is correct will depend on how the
strings you're looking for are encoded in the file. The file itself won't
help you with this, assuming it's actually a binary file, so you'll just
have to know what's the correct encoding to use.
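
A minimal sketch of that conversion, assuming the strings in the file are stored as single-byte ASCII. That is only an assumption; nothing in the thread says which encoding the file actually uses. The names EncodingDemo and ToSearchBytes are made up for illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Converts a search or replacement string to the bytes it would occupy
    // in the file, using whichever encoding the caller believes the file uses.
    public static byte[] ToSearchBytes(string text, Encoding encoding)
    {
        return encoding.GetBytes(text);
    }

    public static void Main()
    {
        // Assumption: the file stores its strings as ASCII.
        byte[] findBytes = ToSearchBytes("ABC", Encoding.ASCII);
        byte[] replaceBytes = ToSearchBytes("XYZ", Encoding.ASCII);

        Console.WriteLine(BitConverter.ToString(findBytes));    // 41-42-43
        Console.WriteLine(BitConverter.ToString(replaceBytes)); // 58-59-5A
    }
}
```

If the file held UTF-16 strings instead, passing Encoding.Unicode would give six bytes per three-character string, so the search pattern depends entirely on the encoding chosen.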

Pete
 
Jon Skeet [C# MVP]

Peter Duniho said:
From your code sample, it looks like you want the Encoding class. This
will allow you to take the strings and convert them to byte arrays. Then
you can use the "find" byte array as you search through the binary file,
and of course using the "replace" byte array to supply the replacement
bytes once you've found something to replace.

Of course, the key will be to choose an Encoding instance that is
appropriate to the file. Which one is correct will depend on how the
strings you're looking for are encoded in the file. The file itself won't
help you with this, assuming it's actually a binary file, so you'll just
have to know what's the correct encoding to use.

And of course you also need to consider the case where although the
*strings* are the same length, they may not produce byte arrays which
are of the same length...
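
A small sketch of that pitfall, using a hypothetical replacement string "ABÉ" that is not from the thread. It has the same number of characters as "ABC" but encodes to a different number of bytes in UTF-8. The helper SameByteLength is an illustrative name, not a framework method.

```csharp
using System;
using System.Text;

public static class ByteLengthDemo
{
    // True when both strings encode to the same number of bytes, which is
    // what a same-size binary replacement actually requires.
    public static bool SameByteLength(string a, string b, Encoding encoding)
    {
        return encoding.GetByteCount(a) == encoding.GetByteCount(b);
    }

    public static void Main()
    {
        string find = "ABC";
        string replace = "AB\u00C9"; // "ABÉ": three characters, like "ABC"

        Console.WriteLine(find.Length == replace.Length);                   // True
        Console.WriteLine(Encoding.UTF8.GetByteCount(find));                // 3
        Console.WriteLine(Encoding.UTF8.GetByteCount(replace));             // 4
        Console.WriteLine(SameByteLength(find, replace, Encoding.UTF8));    // False
        Console.WriteLine(SameByteLength(find, replace, Encoding.Unicode)); // True
    }
}
```

Checking the encoded byte counts before replacing, rather than the string lengths, avoids shifting every later offset in the file.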
 
Mufaka

I have done this before. The approach I took was to read the file bytes
into memory (because the files I was working with were small), move the
file out of the way, and then write a new file replacing the "find"
bytes with the "replace" bytes. So similar to your approach.

Here's a snippet from what I wrote (it was in 1.1 and was just "quick
and dirty" code):

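ArrayList newBytes = new ArrayList(bytes.Length);
int ndx = 0;

for (int x = 0; x < bytes.Length; x++) // bytes is the original file's bytes
{
    if (bytes[x] == findBytes[ndx]) // findBytes is a byte[] from the "find" string
    {
        if (ndx == findBytes.Length - 1)
        {
            // Full match: emit the replacement instead of the matched bytes.
            for (int y = 0; y < replaceBytes.Length; y++) // replaceBytes is a byte[] from the "replace" string
            {
                newBytes.Add(replaceBytes[y]);
            }
            ndx = 0;
        }
        else
        {
            ndx++;
        }
    }
    else
    {
        // Mismatch: go back to the first byte of the partial match, copy
        // that one byte, and resume scanning at the byte after it, so a
        // match that begins inside the partial match is not missed.
        x -= ndx;
        ndx = 0;
        newBytes.Add(bytes[x]);
    }
}

// Copy any partial match still pending at the end of the file.
for (int y = bytes.Length - ndx; y < bytes.Length; y++)
{
    newBytes.Add(bytes[y]);
}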

ndx keeps track of which byte in findBytes should be compared to the
current byte of the original file.
 
John B

File.WriteAllText(
    "Filename.txt",
    File.ReadAllText("Filename.txt").Replace("ABC", "XYZ"));

As the others have pointed out, you need to look at the encoding class.
However if you know the encoding, you can then convert the bytes to a
string and just do string.Replace on them and re-write the file.

JB
 
Peter Duniho

John B said:
As the others have pointed out, you need to look at the encoding class.
However if you know the encoding, you can then convert the bytes to a
string and just do string.Replace on them and re-write the file.

No, you can't. The OP specifically said this is a "binary file", which
implies that it's not a text file. Only a text file can be read entirely
as text safely.

Your code will fail the moment any bytes are encountered in the file that
are not valid for the target encoding. One hopes it will fail with an
error, but there's a small chance that it will simply happily interpret
binary data that's not a string as if it were, and then re-encode it as
some different sequence of binary data when writing back to the file.

In either case, it's not a suitable approach for use with binary files.
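
A short sketch of the silent-corruption case, assuming the text encoding is UTF-8 with its default replacement behavior (the encoding choice is an assumption, and RoundTripDemo is an illustrative name). The byte 0xFF can never appear in valid UTF-8, so it stands in for arbitrary binary data.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Decodes the bytes as text and immediately re-encodes them, which is
    // what reading a file as text and writing it back amounts to.
    public static byte[] RoundTrip(byte[] data, Encoding encoding)
    {
        return encoding.GetBytes(encoding.GetString(data));
    }

    public static void Main()
    {
        byte[] original = { 0xFF, 0x41, 0x42, 0x43 }; // one binary byte, then "ABC"
        byte[] result = RoundTrip(original, Encoding.UTF8);

        Console.WriteLine(BitConverter.ToString(original)); // FF-41-42-43
        Console.WriteLine(BitConverter.ToString(result));   // EF-BF-BD-41-42-43
    }
}
```

No exception is thrown: the invalid byte is decoded as the replacement character U+FFFD and written back as three different bytes, so the file changes even where no replacement was requested.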

Of course, one thing that hasn't been mentioned yet is whether it's really
all that wise to do a search-and-replace on a binary file by assuming that
any sequence of bytes that looks like a specific string is actually that
string. Depending on the binary format, it's possible that there may be
sequences of bytes that look like strings but which are not. Replacing
those may well cause whatever is trying to use the binary file to fail
when it tries to read the modified version.

My experience has been that the likelihood of this being a problem is
low. But it's not non-existent. To truly safely replace strings within a
binary file, one really needs to understand the file format itself and
parse the file as binary, replacing strings only in parts of the file
known with 100% certainty to contain string data.

For some sort of internal tool, where the input data is limited to some
known subset of possibilities, and where it is either known that there
will never be binary data that looks like strings, or there's a reliable
way of detecting when that occurs, I think what the OP is asking for is
not unreasonable. However, it's not the sort of thing I'd let an average
user get their hands on.

Pete
 
Tom

I've used binary files a lot because of efficiency and compactness
when saving numerical data. Sometimes I create these files with a
precise header configuration, but the vast bulk of the files consist
of a data structure that I have specified. Not knowing the exact
data format of the file leaves you vulnerable to several potential
pitfalls. One can always ask whether you know the format of the file
and, if not, why not. If you know the format, then reading the data
structures, interpreting them, and writing modified values back to the
file is the path to take, not a global search and replace at the file
level. If you don't know the format (hacking?), then I suggest
WinHex for a one-time task on a specific file. Of course even the
simplest encryption methods will render WinHex all but useless.
Certainly most robust programs will not allow you to easily find a
text representation of a password (as an example). An even more
advanced program might bait the file with an unencrypted string of
text duplicating the password and then monitor the file for any signs
of circumvention. For example, does the bait match the decrypted
password stored elsewhere within the file? Lots of issues to consider.
 
mouac01

Thanks for all your comments. Here's what I'm trying to mimic in
VB6. I guess .NET is more restrictive in what type of file you can do
find/replace. Sounds too difficult in C# so I might just have to
revert back to VB6 for this one. Thanks...

Sub FindRplace()
    Dim sBuffer As String
    Dim ff As Integer
    ff = FreeFile
    Open txtFile.Text For Binary As #ff
    sBuffer = Space$(LOF(ff))
    Get #ff, 1, sBuffer
    Close #ff
    sBuffer = Replace(sBuffer, "ABC", "XYZ")
    Open txtNew.Text For Binary As #ff
    Put #ff, , sBuffer
    Close #ff
End Sub
 
Peter Duniho

mouac01 said:
Thanks for all your comments. Here's what I'm trying to mimic in
VB6. I guess .NET is more restrictive in what type of file you can do
find/replace. Sounds too difficult in C# so I might just have to
revert back to VB6 for this one.

It's not difficult in C#. The code would not even be significantly
different from the VB6 code you posted, other than the lack of a
short-hand "replace" feature for dealing with byte arrays (which you
should be able to easily write yourself). And your VB6 code has all the
same problems we've warned could come up when doing it in C#.
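
One way such a byte-array "replace" might look, as a rough sketch rather than a definitive implementation. It assumes the file fits in memory (as the VB6 code does) and that the strings are stored as ASCII. BinaryReplace, ReplaceBytes and IsMatchAt are illustrative names, not framework methods.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BinaryReplace
{
    // Returns a copy of data with every occurrence of find replaced by replace.
    public static byte[] ReplaceBytes(byte[] data, byte[] find, byte[] replace)
    {
        if (find.Length == 0) throw new ArgumentException("find must not be empty", "find");

        List<byte> result = new List<byte>(data.Length);
        int i = 0;
        while (i < data.Length)
        {
            if (IsMatchAt(data, i, find))
            {
                result.AddRange(replace); // emit the replacement
                i += find.Length;         // and skip the matched bytes
            }
            else
            {
                result.Add(data[i]);      // copy the byte unchanged
                i++;
            }
        }
        return result.ToArray();
    }

    // True when find occurs in data starting exactly at index start.
    private static bool IsMatchAt(byte[] data, int start, byte[] find)
    {
        if (start + find.Length > data.Length) return false;
        for (int j = 0; j < find.Length; j++)
        {
            if (data[start + j] != find[j]) return false;
        }
        return true;
    }

    // Whole-file version, mirroring the shape of the VB6 routine.
    public static void FindReplace(string oldFile, string newFile)
    {
        byte[] find = Encoding.ASCII.GetBytes("ABC");    // assumption: ASCII
        byte[] replace = Encoding.ASCII.GetBytes("XYZ"); // assumption: ASCII
        File.WriteAllBytes(newFile, ReplaceBytes(File.ReadAllBytes(oldFile), find, replace));
    }

    public static void Main()
    {
        byte[] data = { 0x00, 0x41, 0x41, 0x42, 0x43, 0xFF };
        byte[] find = Encoding.ASCII.GetBytes("ABC");
        byte[] replace = Encoding.ASCII.GetBytes("XYZ");
        Console.WriteLine(BitConverter.ToString(ReplaceBytes(data, find, replace)));
        // 00-41-58-59-5A-FF
    }
}
```

The scan compares the whole pattern at each position, which is simple and adequate for small files; bytes that are not part of a match are copied through untouched, so non-text data is never decoded.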

But if you've already got VB6 code to do what you want, maybe you should
just stick with that if you'd rather not use the suggestions offered here.

Pete
 
John B

Peter said:
No, you can't. The OP specifically said this is a "binary file", which
implies that it's not a text file. Only a text file can be read
entirely as text safely.
<...>
Oops, missed the "binary file" bit.
Was under the impression that it was simply a text file.
Never mind.

JB
 
