Blob and XML

shapper

Hello,

This might be a strange question, but is there a way to save a file
inside an XML file?
Something similar to saving a Blob in a SQL database.

Is this possible with C#?
In the future I would like to get data from a SQL table where one of
the columns holds a Blob.
I wonder if I could save that Blob in an XML file ... and later
retrieve the file for download.

Thanks,
Miguel
 
Peter Duniho

As David says, you can use a character-based encoding for binary data,
such as Base64, to store binary data in your XML.

In fact, here are a couple of extension methods I wrote a while back to do
just that:

    public static XCData ToXmlCData(this byte[] rgb)
    {
        StringBuilder sb = new StringBuilder();

        using (XmlWriter writer = new XmlTextWriter(new StringWriter(sb)))
        {
            writer.WriteBase64(rgb, 0, rgb.Length);
        }

        return new XCData(sb.ToString());
    }

    public static byte[] FromXmlCData(this XElement xcdata)
    {
        using (MemoryStream streamOut = new MemoryStream())
        using (XmlReader reader = new XmlTextReader(new StringReader(xcdata.ToString())))
        {
            byte[] rgb = new byte[4096];
            int cb;

            reader.MoveToContent();
            while ((cb = reader.ReadElementContentAsBase64(rgb, 0, rgb.Length)) > 0)
            {
                streamOut.Write(rgb, 0, cb);
            }

            return streamOut.ToArray();
        }
    }

Unfortunately, that's the best I could come up with. My main issues are
that I had to use the older System.Xml Base64 support because
System.Xml.Linq doesn't include it, and the only way I saw to convert from
Base64 back to binary easily was to convert the entire node to a string
and read the string (which IMHO is a problem because there's already a
perfectly good string attached to the XElement object containing the data
you want to read, but the .NET Base64 support doesn't know what to do with
a raw string like that).

You can address those problems reasonably easily by using a third-party
Base64 implementation, or writing your own. But I wanted to see if I
could get a "pure .NET" solution. Turns out, I could, even if it is a
little awkward.

For relatively small sizes of data, I think it should work just fine
though. You probably don't want to be embedding hundreds or thousands of
megabytes into your XML file anyway. Base64 inflates your data by roughly
33% (every 3 bytes become 4 characters), which is fine for small chunks of
data but can become a real performance and disk space problem for very
large data.
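
To put a number on that overhead, here is a minimal sketch. `Base64Size` and `Base64Length` are hypothetical names used only for illustration, not framework APIs; the only framework call is `Convert.ToBase64String`, which produces unbroken Base64 text when called without formatting options.

```csharp
using System;

public static class Base64Size
{
    // Hypothetical helper: length of the unbroken Base64 text for a given byte count.
    // Every group of up to 3 input bytes becomes 4 output characters ('=' pads the last group).
    public static long Base64Length(long byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    // Compares the computed length with what Convert actually produces.
    public static bool MatchesConvert(int byteCount)
    {
        string encoded = Convert.ToBase64String(new byte[byteCount]);
        return encoded.Length == Base64Length(byteCount);
    }
}
```

For a 20MB file, `Base64Size.Base64Length(20L * 1024 * 1024)` works out to about 26.7MB of text, before any XML markup is added around it.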

Pete
 
shapper

I am a little bit lost.
Shouldn't I encode the file using Base64? Then I get a string, right?
And then I can use LINQ to save that string.
For reading, the process would be the inverse.

None of the files I need to encode will ever be larger than
20MB. So I think it would be OK.

I also found the following (maybe it is helpful for what you described):
http://stackoverflow.com/questions/475421/base64-encode-a-pdf-in-c

I am still googling for implementations to check the options out
there.
 
Peter Duniho

All that is basically what the code I posted does. It just represents the
input or output (as appropriate) as an XCData element (i.e. a
System.Xml.Linq object).

Right...the Stack Overflow answer you linked just does the byte/string
conversion (i.e. isn't XML-specific) but is basically the same idea. I
think that I must not have realized the Base64 support in the Convert
class existed when I wrote the code I posted, because it seems to me that
using that for the conversion is at least as efficient, if not more so,
than the System.Xml stuff my code uses (no need to copy the Base64 string).
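
For comparison, a Convert-based version of the same round trip might look like the sketch below. The class and method names (`XmlBlob`, `ToBase64Element`, `FromBase64Element`) are made up for illustration, and the element name is whatever the caller picks; only `Convert.ToBase64String`, `Convert.FromBase64String` and the System.Xml.Linq types are real framework APIs.

```csharp
using System;
using System.Xml.Linq;

public static class XmlBlob
{
    // Wraps the bytes in an element whose text content is the Base64 string.
    public static XElement ToBase64Element(XName name, byte[] data)
    {
        return new XElement(name, Convert.ToBase64String(data));
    }

    // Reads the element's text content back into the original bytes.
    // Convert.FromBase64String skips white space, so line breaks in the text are tolerated.
    public static byte[] FromBase64Element(XElement element)
    {
        return Convert.FromBase64String(element.Value);
    }
}
```

Usage would be along the lines of `new XElement("Product", XmlBlob.ToBase64Element("Brochure", bytes))`, followed by a normal `Save` on the containing document.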

I'll blame MSDN's lame search engine for my inability to find the Convert
class's Base64 support when I was looking for it. Though, admittedly...it
was some time ago that I wrote the code, so who knows what the real reason
I overlooked Convert is. Could be it was even on purpose for some reason
that I can't remember at the moment. :)

Pete
 
shapper

I came up with the following:

    byte[] binaryData;
    binaryData = new Byte[product.BrochureFile.InputStream.Length];
    long bytesRead = product.BrochureFile.InputStream.Read(binaryData, 0, (int)product.BrochureFile.InputStream.Length);
    product.BrochureFile.InputStream.Close();
    string base64String = System.Convert.ToBase64String(binaryData, 0, binaryData.Length);

product.BrochureFile is of type HttpPostedFileBase:
http://msdn.microsoft.com/en-us/library/system.web.httppostedfilebase.aspx

This is what I get from the form in the ASP.NET MVC application.

Then I just insert the base64String into the XML file using LINQ.

What do you think?
 
Peter Duniho

Seems fine to me. The use of the stream from another object, where you
close the stream but not the containing object, seems a little odd to me.
But maybe that's just a web-application thing that I don't know about. As
far as the rest goes, for reasonably small files (and 20MB is probably
small enough), I think it's okay. You probably wouldn't want to try to
buffer the entire file contents for very large files, but it doesn't sound
like that's a problem for you.

Pete
 
shapper

The use of the stream from another object, where you  
close the stream but not the containing object, seems a little odd to me. 

Do you suggest any change or improvement?

I created the following:

    public static String ToBase64(HttpPostedFileBase file) {
        Stream stream = file.InputStream;
        Byte[] bytes = new Byte[stream.Length];
        Int64 data = file.InputStream.Read(bytes, 0, (Int32)stream.Length);
        stream.Close();
        return Convert.ToBase64String(bytes, 0, bytes.Length);
    } // ToBase64

Then I am having a few problems with the FromBase64. I have the
following:

    public static HttpPostedFileBase FromBase64(String file, String filename) {

        StreamReader stream = new StreamReader(filename, Encoding.ASCII);
        Char[] chars = new Char[stream.BaseStream.Length];
        stream.Read(chars, 0, (Int32)stream.BaseStream.Length);
        String data = new String(chars);

        HttpPostedFileBase output;
        // output.InputStream = ???;

    } // FromBase64

This made sense to me, but then I got lost on how to create the
HttpPostedFileBase.
I thought of using output.InputStream = stream, but InputStream is
read-only.

However, output.InputStream has the usual methods and properties like
Read, BeginRead, BeginWrite, Seek, Position, Length, ...

I think this is the way, but I am not completely sure of the correct way
to do this.
 
shapper

And there is something else I wonder about:

I am converting a file to a Base64 string and saving it in the XML file.
When doing the inverse conversion, do I need to know the content type?
Whether it is a PDF, an image, etc.? Maybe save the MIME type in the XML too?
Or is this not necessary?

Thanks,
Miguel
 
Peter Duniho

Do you suggest any change or improvement?

Unfortunately, I don't know anything about the specific domain you're
doing this in. The "HttpPostedFileBase" class isn't anything I've used,
so I can't say whether there's actually a problem closing that stream,
never mind what the fix would be if there is.
[...]
Then I am having a few problems with the FromBase64. I have the
following:

public static HttpPostedFileBase FromBase64(String file, String filename)
{

StreamReader stream = new StreamReader(filename, Encoding.ASCII);

Are you sure you wrote ASCII characters to the file? Unfortunately, your
two methods are not direct inverses of each other, so I have no way to
confirm that the output encoding is the same as the encoding you're
reading here. Hopefully it is. If not, that's a problem.

Char[] chars = new Char[stream.BaseStream.Length];
stream.Read(chars, 0, (Int32)stream.BaseStream.Length);

At this point, you should call Convert.FromBase64CharArray() to convert
your "chars" array to a byte[]. Then you can use that byte[] to recreate
your "HttpPostedFileBase", however that works (as I said, I don't know
anything about that part of the question).
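
A minimal sketch of that decoding step, assuming the file on disk holds nothing but Base64 text. `Base64File` and `ReadBase64File` are hypothetical names for illustration; `File.ReadAllText` and `Convert.FromBase64CharArray` are the actual framework calls.

```csharp
using System;
using System.IO;
using System.Text;

public static class Base64File
{
    // Reads a text file that contains only Base64 characters and returns the decoded bytes.
    public static byte[] ReadBase64File(string path)
    {
        // Base64 text uses only ASCII characters, so reading as ASCII is safe
        // as long as the file was written with an ASCII-compatible encoding.
        char[] chars = File.ReadAllText(path, Encoding.ASCII).ToCharArray();
        return Convert.FromBase64CharArray(chars, 0, chars.Length);
    }
}
```

What happens to the resulting byte[] afterwards is a separate question; the sketch stops at the conversion.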

Though, looking at the docs for HttpPostedFileBase, it looks to me as
though it wouldn't even make sense to create an HttpPostedFileBase
instance from your Base64 encoded data. That class appears to be only for
files posted back to the server from the client; ultimately, the server
would be doing something specific with the file. In this case, it appears
that "specific" thing is to Base64 encode the file and save the data
somewhere.

But when you read the data back out, presumably it's for some purpose
other than to emulate a client posting data to the server. So you really
want your byte[] moved into something else that's more appropriate. For
example, if you are sending data back to the client somehow, you'd write
those bytes to whatever object (probably a Stream of some sort) that is
encapsulating the connection to the client.

I think that as far as that part of the question goes, you probably ought
to post the question to the ASP.NET newsgroup. Converting from Base64
back to bytes is easy, but knowing what to do with those bytes is not, at
least not for someone that hasn't done any real ASP.NET programming.
Looking at the ASP.NET docs, it looks like System.Web.HttpResponse might
be a starting place, but I can't really be sure.

Pete
 
Peter Duniho

And there is something else I wonder about:

I am converting a file to a Base64 string and saving it in the XML file.
When doing the inverse conversion, do I need to know the content type?
Whether it is a PDF, an image, etc.? Maybe save the MIME type in the XML too?
Or is this not necessary?

That all depends on how you're going to use it. Do you even have the MIME
type when the client posts the file back to you? If so, you could store
the MIME type and (I assume) include that as part of the response. If
not, then just let the client figure it out from the extension.

That all assumes that the scenario is:

-- client sends you a file
-- you store the file
-- you retrieve the file
-- you send file back to the client

I.e. there's an obvious "client-to-server-to-client", exactly inverted
process going on. If not, then without knowing exactly how you _are_
using the data, it's very difficult to comment on the best way to use it.

Pete
 
shapper

Peter,

The problem here is not so much with ASP.NET. I will explain:

On my model Product I have a property named Brochure.
In Brochure I need to save the file ... using byte[], Stream, etc.
This is the decision I need to take.

From the upload form I get an HttpPostedFileBase that contains its
InputStream with the usual methods.

When I need to return a file to the user I can return:

A FileStreamResult (Uses a stream to return the file. I also need
to know the ContentType)
http://msdn.microsoft.com/en-us/library/system.web.mvc.filestreamresult_members.aspx

A FileContentResult (Uses a byte[] to return the file. I also need
to know the ContentType)
http://msdn.microsoft.com/en-us/library/system.web.mvc.filecontentresult_members.aspx

A FilePathResult (Uses a path to return the file. I also need to
know the ContentType)
http://msdn.microsoft.com/en-us/library/system.web.mvc.filepathresult_members.aspx

So I think in my model I could use Stream as the brochure type. What
do you think?

So I need two methods:

1. Convert a Stream to a Base64 string:
(The Brochure Stream would be the HttpPostedFileBase.InputStream
of the posted file)
Now I could save the Base64 string in an XML file.

2. Convert a Base64 string to a Stream:
(Now with the Brochure Stream defined I could return a
FileStreamResult)

These are the two methods I am trying to create.
Another problem I have is that it seems I need to define the
ContentType.
Is that included in a Stream or Byte[]?

I've posted this problem on an ASP.NET MVC forum, but so far I
haven't gotten anything beyond what I was able to create myself.

Thanks,
Miguel
 
shapper

I came up with the following:

    public static String ToBase64(Stream stream) {
        using (MemoryStream buffer = new MemoryStream()) {
            // CopyTo keeps reading until the end of the stream; a single
            // Read call may return fewer bytes than requested.
            stream.CopyTo(buffer);
            stream.Close();
            return Convert.ToBase64String(buffer.ToArray());
        }
    } // ToBase64

    public static Stream FromBase64(String base64) {
        Byte[] bytes = Convert.FromBase64String(base64);
        // Do not close the stream here: the caller still has to read from it.
        return new MemoryStream(bytes);
    } // FromBase64

Is this the way to go?
I am not sure if I should do the conversion in Blocks but the files I
have will never be bigger than 1MB.
But even so ...
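
On the question of converting in blocks: one possible approach, sketched here under assumptions, is to let XmlWriter do the Base64 encoding a chunk at a time, so the file never has to be held in memory as one array or one string. `XmlBase64Writer` and `WriteBase64Element` are hypothetical names, and the 4096-byte buffer is an arbitrary choice; `XmlWriter.WriteBase64` is the framework call doing the work.

```csharp
using System.IO;
using System.Xml;

public static class XmlBase64Writer
{
    // Writes <name>...Base64 text...</name>, reading the input stream in chunks.
    // The stream is left open; closing it is the caller's responsibility.
    public static void WriteBase64Element(XmlWriter writer, string name, Stream input)
    {
        byte[] buffer = new byte[4096];
        int read;

        writer.WriteStartElement(name);
        // Read may return fewer bytes than requested, so loop until it returns 0.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            writer.WriteBase64(buffer, 0, read);
        }
        writer.WriteEndElement();
    }
}
```

For files of 1MB or so, the simpler whole-buffer conversion is probably fine; the chunked form mostly matters once files get large.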
 
shapper

if the Microsoft team responsible for this area of  
technology can't be bothered to properly document the technology, it  
stands to reason they also can't be bothered to provide a clean, usable  
API.

The documentation for ASP.NET MVC is not very good. Don't ask me
why ...

I predict I will need to know the ContentType because I might need
some model validation, like file size and file type (I am using MIME
types for that).
In fact I am already doing Fluent Validation with the
HttpPostedFileBase ...
It will also help me if I need to filter documents by type and so
on ... since the stream does not carry any information about it.

So I created the following:

    public class FileBase {

        public Stream Content { get; set; }
        public String Type { get; set; }
        public Int32 Size { get; set; }

        public FileBase(Stream content, Int32 size, String type) {
            Content = content;
            Type = type;
            Size = size;
        }

        public FileBase(HttpPostedFileBase file)
            : this(file.InputStream, file.ContentLength, file.ContentType) {
        }
    } // FileBase

Then I save this information in an XML file or a database table named
Files.
Now I am able to filter, sort by size and so on.

That way, every time I load a record, e.g. a Product, I also get the file
or files related to it.

So I will get a Product object with all the Product details and its files.

Sorry, my posts may have been confusing, but I can't find anything
done like this, so I am exploring to get to the best solution possible.
 
Joe Fawcett

Two points:
* Why do you feel the need to save as XML type, why not save the blob
directly? (I may have missed the rationale for this in the long thread)
* If you do save as XML it's easy to have an attribute in the XML to specify
content/type if they may be different.
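
As a rough illustration of the second point, the content type can sit in an attribute next to the Base64 payload. The element and attribute names below ("File", "contentType", "size") and the `XmlFileEntry` class are assumptions made up for this sketch, not anything prescribed by the framework or by the earlier posts.

```csharp
using System;
using System.Xml.Linq;

public static class XmlFileEntry
{
    // Builds <File contentType="..." size="...">Base64 text</File>.
    public static XElement Create(byte[] content, string contentType)
    {
        return new XElement("File",
            new XAttribute("contentType", contentType),
            new XAttribute("size", content.Length),
            Convert.ToBase64String(content));
    }

    // Returns the stored MIME type, or null if the attribute is missing.
    public static string GetContentType(XElement file)
    {
        return (string)file.Attribute("contentType");
    }

    // Decodes the element's text content back into bytes.
    public static byte[] GetContent(XElement file)
    {
        return Convert.FromBase64String(file.Value);
    }
}
```

With the type and size stored as attributes, filtering or sorting the entries does not require decoding the payload.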
 
