Problem downloading a file with HTTP

Ray Mitchell

Hello,

I'm trying to download a binary ZIP file using HTTP. Following an example
in MSDN and other sources, my code is:

WebClient client = new WebClient();
client.DownloadFile("http://www.website.com/BinaryFile.zip", "BinaryFile.zip");

Using FTP I can download the file fine, but HTTP downloads something else,
which is not what is in my file. Instead its content is something like the
following. It looks to me like some sort of status information is being
downloaded instead of the actual file:

<html>
<head>
<title>website.com: The Leading Storage Archive Site on the Net</title>
</head>
<frameset cols="1,*" border=0>
<frame name="top"
src="t.php?uid=ws1048efc603a93ed3.10806440&src=&cat=computers%2Finternet%2Fdownloads&kw=Storage+Archive&sc=storage+media"
scrolling=no frameborder=0 noresize framespacing=0 marginwidth=0
marginheight=0>
<frame src="search.php?uid=ws1048efc603a93ed3.10806440&src="
scrolling="auto" framespacing=0 marginwidth=0 marginheight=0 noresize>
</frameset>
<noframes>
This page requires frames.
</noframes>
</html>
 
Ray Mitchell

So, as it turns out, the file name was incorrect and the named file
didn't exist. However, I did use a try/catch and just assumed that if the
file didn't exist I'd get an exception, which I didn't. What is the proper
way to approach this? Is there any way to get a directory listing or test
whether a file actually exists before attempting a download? If I keep doing
it the way I am now, I assume I'll have to inspect the contents of whatever
was downloaded to see if it's what I expect. This doesn't seem like the
correct approach. Thanks, Ray
 
Ray Mitchell

Peter Duniho said:
Not reliably, no. It would depend on the server, and the exact behavior
is not standardized as far as I know (I've seen different HTTP servers
return directory information in a variety of different ways).


I agree that inspecting the returned content itself is less than optimal.

It seems to me that you _should_ be able to somehow get the status code
for the HTTP response. This is the three-digit numeric value that's
returned in the very first response line, even before any headers are sent
from the server. You would be able to inspect the status code to
determine whether the retrieval was actually successful or not.

But even that depends on the HTTP server somewhat, in that some are
configured to return an HTML page with a success status even when a
failure occurs.

Assuming you have a server that _does_ set the status code correctly (it
could return an error page and still set the status code), you should be
able to inspect it. WebClient does not expose the response publicly (its
GetWebResponse() method is protected), so one option is to make the request
with HttpWebRequest, call GetResponse(), cast the return value to
HttpWebResponse, and look at the StatusCode property.
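As a rough sketch of the status-code check described above: the class and method names here (HttpProbe, GetStatus, IsSuccess) are made up for illustration, and it assumes the server answers HEAD requests, which not every server does. It uses HttpWebRequest, whose GetResponse() result can be cast to HttpWebResponse. For 4xx/5xx replies GetResponse() throws a WebException, so the status is read from the exception's Response in that case.

```csharp
using System;
using System.Net;

static class HttpProbe
{
    // Pure helper: treat any 2xx status as success.
    public static bool IsSuccess(HttpStatusCode code)
    {
        int value = (int)code;
        return value >= 200 && value <= 299;
    }

    // Asks the server for the status of a URL without downloading the body.
    // Assumption: the server supports the HEAD method.
    public static HttpStatusCode GetStatus(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // Error statuses such as 404 arrive as an exception that carries the response.
            var error = ex.Response as HttpWebResponse;
            if (error == null)
            {
                throw; // no HTTP response at all (DNS failure, timeout, ...)
            }
            using (error)
            {
                return error.StatusCode;
            }
        }
    }
}
```

Usage would be something like HttpProbe.IsSuccess(HttpProbe.GetStatus("http://www.website.com/BinaryFile.zip")). A server that answers a bad URL with 200 and an HTML page will still pass this check, so it only helps with servers that set the status code properly.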

I think that that's probably the most reliable way to detect an error, but
it does depend on the server being well-behaved. That said, that's true
for ALL network operations, so you should always be coding defensively in
any case. Assume at every point along the way that you might receive a
response other than what's valid.
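One cheap defensive check for this particular case is to look at the first bytes of what was downloaded. A ZIP file normally begins with the bytes "PK" followed by 03 04 (or 05 06 for an empty archive), whereas an error page begins with something like "<html>". The sketch below is only a heuristic and the names (ZipCheck, LooksLikeZip, FileLooksLikeZip) are invented for the example; it does not validate the archive itself.

```csharp
using System;
using System.IO;

static class ZipCheck
{
    // True if the bytes start with a ZIP signature:
    // 50 4B 03 04 (local file header) or 50 4B 05 06 (empty archive).
    public static bool LooksLikeZip(byte[] head)
    {
        if (head == null || head.Length < 4) return false;
        if (head[0] != 0x50 || head[1] != 0x4B) return false; // "PK"
        return (head[2] == 0x03 && head[3] == 0x04)
            || (head[2] == 0x05 && head[3] == 0x06);
    }

    // Reads the first four bytes of a file and applies the signature check.
    public static bool FileLooksLikeZip(string path)
    {
        var head = new byte[4];
        int total = 0;
        using (var stream = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop until done or EOF.
            while (total < head.Length)
            {
                int n = stream.Read(head, total, head.Length - total);
                if (n == 0) break;
                total += n;
            }
        }
        return total == head.Length && LooksLikeZip(head);
    }
}
```

Calling ZipCheck.FileLooksLikeZip("BinaryFile.zip") right after the download would have flagged the HTML page in the original post, since that file starts with "<html>" rather than "PK".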

Pete

Thanks Pete, as always. The more I consider this the more I think that FTP
is the right way to go. Ray
 
Ray Mitchell

Peter Duniho said:
It might be. Or it might not. It depends a lot on the use case.

FTP should definitely offer a less variable experience. But some networks
are set up to allow only HTTP traffic. You might consider providing the
user with the choice as to what approach they want to use.

Of course, if you have complete end-to-end control over the entire system,
then you could simply go with what works best for your purposes. But it
sounds like you don't.

Have you at least checked the error case that you're looking at right now
to see whether the StatusCode property is set accordingly? That would be a
useful data point for your decision-making.

Pete

Pete,

I haven't checked the status code yet, but before I put any more effort
into the HTTP side I'm still considering whether it's even appropriate for my
application, especially considering the difficulty of getting directory/file
listings with HTTP versus the ease of doing it with FTP, and I definitely
need such listings. I actually will have control over the end-to-end system
since I will be specifying what the users must have available.

My application uses an FTP client to upload thousands of files into a
complex directory configuration on a web server. Then there will be multiple
users that need to download some of them that they haven't already
downloaded. Ultimately I'll need to restrict certain users to certain
directories. I know I can do this in FTP by setting up separate
password-protected user subaccounts with limited access, but I haven't looked
into this possibility using HTTP. However, I keep coming back to the issues
involved in getting directory/file listings and FTP seems to be the clear
winner there.

Ray
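For reference, a minimal sketch of the FTP listing approach discussed above, using FtpWebRequest with the ListDirectory (NLST) method. The class name, method names, and parameters are placeholders invented for this example, and it assumes the server returns one name per line. The second method simply compares the remote names against what is already present locally.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

static class FtpListing
{
    // Asks the FTP server for the names in a directory (NLST).
    // directoryUri, user and password are supplied by the caller.
    public static List<string> ListNames(string directoryUri, string user, string password)
    {
        var request = (FtpWebRequest)WebRequest.Create(directoryUri);
        request.Method = WebRequestMethods.Ftp.ListDirectory;
        request.Credentials = new NetworkCredential(user, password);

        var names = new List<string>();
        using (var response = (FtpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length > 0) names.Add(line);
            }
        }
        return names;
    }

    // Pure helper: remote names that are not yet present locally.
    // Comparison is case-sensitive, since many FTP servers are.
    public static List<string> NotYetDownloaded(IEnumerable<string> remote, IEnumerable<string> local)
    {
        var have = new HashSet<string>(local, StringComparer.Ordinal);
        var missing = new List<string>();
        foreach (var name in remote)
        {
            if (!have.Contains(name)) missing.Add(name);
        }
        return missing;
    }
}
```

A caller would pass the result of ListNames for a given directory, together with the local file names, to NotYetDownloaded and then download only what comes back. Per-user directory restrictions would still be handled by the FTP account setup on the server, not by this code.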