Reading Zipped data from database record

peter.bremer

Hi all,

I've got a SQL Server database that contains zipped information stored
in (binary) image fields. To complicate things, this zipped data is
combined with plain-text data.

I've verified the zipped data to be readable by SharpZipLib by writing
the field contents to a file and using a binary editor to manually
strip the plaintext data in front of it.

I'm not very advanced in C#, and a total newbie with SharpZipLib, but
is it possible to write code that reads the zipped data from the
database field, and extracts it in memory for further processing?

Thanks, Peter
 
Marc Gravell

Well, since you presumably already have code that writes to a FileStream,
you could swap that stream for a MemoryStream, and then either leave it "as
is" or extract a byte[] using ToArray().

Is this what you meant?

Marc
 
peter.bremer

I used another program to extract the binary data into a file; I didn't
write that myself.

I've gotten as far as reading the binary field into a byte array, and
deleting the unwanted data with Array.Clear(). However, ZipInputStream
requires a stream as input, not an array.

How would I go about reading the array as a Zip stream?

Peter
 
Marc Gravell

Something like (note: notepad job here... not compiler tested!)

byte[] rawCompressed = //TODO your code
using(MemoryStream compressed = new MemoryStream(rawCompressed))
using(...unzipStream to decompress from "compressed"...)
using(MemoryStream uncompressed = new MemoryStream())
{
    const int BUFFER_SIZE = 1024; // or whatever
    byte[] buffer = new byte[BUFFER_SIZE];
    int bytesRead;
    // copy between the streams decompressing as we go
    while((bytesRead = unzipStream.Read(buffer, 0, BUFFER_SIZE) > 0) {
        uncompressed.Write(buffer, 0, bytesRead);
    }
    // return the byte[]
    return uncompressed.ToArray();
}

Marc
 
Marc Gravell

correction (missing bracket):

while((bytesRead = unzipStream.Read(buffer, 0, BUFFER_SIZE)) > 0) {
    uncompressed.Write(buffer, 0, bytesRead);
}

As an aside, recall that working with /large/ byte[]s can be detrimental to
performance, as it can eat system memory. For small binaries this should be
fine, but for larger items I recommend treating the object as a stream when
possible, either only keeping the smaller compressed byte[] in memory, or
(if the scenario allows) streaming directly from the database server. Not
always possible if you want a nicely disconnected database...
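
A rough sketch of the "stream directly from the database server" idea, untested against a real server. It assumes the image column is the first column of the query; the class and method names (BlobStreaming, CopyBlobTo) are made up for illustration. With CommandBehavior.SequentialAccess the reader does not buffer the whole field, and GetBytes pulls it across in chunks:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

static class BlobStreaming   // hypothetical helper, not part of any library
{
    // Copies column 0 of the first row into 'destination', one chunk at a time.
    public static void CopyBlobTo(string connectionString, string sql, Stream destination)
    {
        using (SqlConnection con = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand(sql, con))
        {
            con.Open();
            // SequentialAccess: columns must be read in order, but large
            // binary fields are not loaded into memory in one piece
            using (SqlDataReader rdr = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
            {
                if (!rdr.Read()) return; // no rows

                byte[] buffer = new byte[8192];
                long offset = 0;
                long read;
                // GetBytes returns the number of bytes copied; 0 means end of field
                while ((read = rdr.GetBytes(0, offset, buffer, 0, buffer.Length)) > 0)
                {
                    destination.Write(buffer, 0, (int)read);
                    offset += read;
                }
            }
        }
    }
}
```

The destination can be a FileStream or anything else that consumes the data as it arrives, so only one 8 KB buffer is held at a time. The connection stays open for the duration of the copy, which is the trade-off against a disconnected design.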

Marc
 
peter.bremer

Okay, I'm almost there, but I get a weird error message from
SharpZipLib:

// read from database
SqlConnection con = new SqlConnection(CONNECTION_STRING);
con.Open();
SqlCommand cmd = new SqlCommand("SELECT Field FROM Table", con);
SqlDataReader rdr = cmd.ExecuteReader();
rdr.Read();
byte[] arr = (byte[])rdr["Field"];

// separate zipped data
int PLAINTEXT_SIZE = 76;
byte[] arr2 = new byte[arr.Length-PLAINTEXT_SIZE];
Array.Copy(arr, PLAINTEXT_SIZE, arr2, 0, PLAINTEXT_SIZE);
foreach (System.Byte b in (System.Byte[])arr2)
    Debug.Write(String.Format("{0:x} ", b));
Debug.WriteLine("");

// decompress the data
MemoryStream compressed = new MemoryStream(arr2);
ZipInputStream zipstream = new ZipInputStream(compressed);
ZipEntry entry;
while ((entry = zipstream.GetNextEntry()) != null)
{
    int size = 2048;
    byte[] data = new byte[2048];
    while (true)
    {
        size = zipstream.Read(data, 0, data.Length); // ZipException
        if (size > 0)
            foreach (byte b in data) { Debug.Write(String.Format("{0:x} ", b)); }
        else
            break;
    }
}
zipstream.Close();


The zip-reading part is almost literally from the SharpZipLib
documentation, but on the second pass of zipstream.Read, I get the
cryptic error message "size mismatch: 88;256 <-> 44;51". On the first
pass, zipstream.Read only reads 51 bytes, which is not all of the data.
(The uncompressed data should be 256 bytes.)
 
peter.bremer

Aaarg... Found it... It should of course be:

Array.Copy(arr, PLAINTEXT_SIZE, arr2, 0, arr2.Length);

Thanks, the code seems to work great!

Peter
 
Marc Gravell

Are you sure that the plaintext size is 76 bytes? 76 characters is not
necessarily the same...
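
A quick illustration of the difference, as a sketch only. The header text here is invented; the real prefix and its encoding are whatever the application wrote into the field:

```csharp
using System;
using System.Text;

class PrefixLength
{
    static void Main()
    {
        // hypothetical plain-text prefix containing two accented characters
        string header = "R\u00e9sum\u00e9";

        Console.WriteLine(header.Length);                         // 6 characters
        Console.WriteLine(Encoding.UTF8.GetByteCount(header));    // 8 bytes as UTF-8
        Console.WriteLine(Encoding.Unicode.GetByteCount(header)); // 12 bytes as UTF-16
    }
}
```

If the prefix is pure ASCII stored in a single-byte encoding, characters and bytes do coincide, and an offset of 76 is fine.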

Also, you can skip some steps if you simply step past the text data within
the stream (.Position).

I will prepare an example as soon as I have grabbed #ZipLib...

Marc
 
Marc Gravell

Damn, you beat me...

You might find the following cleaner, though; start looking at "***"; note
the way you can skip past the junk very simply, without needing lots of
arrays.

Marc

using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // stitch some byte[]s together to get your mangled byte[];
        // in this case, use the #ZipLib ".zip" file, but add
        // 76 bytes of junk to the start
        byte[] zipBin = File.ReadAllBytes(@"D:\084SharpZipLib.zip");
        const int GARBAGE_BYTES = 76;
        byte[] garbageBin = new byte[GARBAGE_BYTES];
        Random rand = new Random(123456); // just a seed
        rand.NextBytes(garbageBin);
        byte[] inputBin = new byte[zipBin.Length + garbageBin.Length];
        garbageBin.CopyTo(inputBin, 0);
        zipBin.CopyTo(inputBin, garbageBin.Length);

        // inputBin should now contain 76 bytes of junk then some zip data

        // *** OK; now prepare to unscramble
        using (MemoryStream inputStream = new MemoryStream(inputBin))
        {
            // skip the junk
            inputStream.Position = GARBAGE_BYTES;
            // create an unzip stream
            using (ZipInputStream zipStream = new ZipInputStream(inputStream))
            {
                // share a single buffer between all files
                const int BUFFER_SIZE = 2048;
                int bytesRead;
                byte[] buffer = new byte[BUFFER_SIZE];
                ZipEntry entry;
                while ((entry = zipStream.GetNextEntry()) != null)
                {
                    // for multi-file usage, assume one MemStream per item;
                    // for single-file, could declare higher up
                    byte[] outputBin;
                    using (MemoryStream outputStream = new MemoryStream())
                    {
                        // read through the compressed data
                        while ((bytesRead = zipStream.Read(buffer, 0, BUFFER_SIZE)) > 0)
                        {
                            outputStream.Write(buffer, 0, bytesRead);
                        }
                        outputBin = outputStream.ToArray();
                    }
                    // output size (per file)
                    Debug.WriteLine(entry.Name + ": " + outputBin.Length.ToString());
                }
            }
        }
    }
}
 
