ContentLength = -1 with HttpWebResponse() ??

Guest

In .NetCF, upon trying to access the following URL:

http://rss.news.yahoo.com/rss/business

via the following code:
WebRequest webReq = WebRequest.Create(url);
WebResponse webResp = webReq.GetResponse();
StreamReader sr = new StreamReader(webResp.GetResponseStream(),
    Encoding.GetEncoding("utf-8"));
char[] buffer = new char[1024];
int count = 0;
while ((count = sr.Read(buffer, 0, buffer.Length)) > 0)
{
    // do something
}

I get an exception when I call sr.Read() - "chunk size is invalid". I think
I've traced it down to this: the ContentLength of the webResp object is -1.
If I try this with other web sites that do return a Content-Length header,
it works fine.

Is there a way to have the reading of the stream not depend on this? I'd
have figured that the library would be forgiving about this, since not every
web application knows the length of the content before sending it out.

Any responses appreciated.

Thanks,
Jay

Joerg Jooss

In .NetCF, upon trying to access the following URL:

http://rss.news.yahoo.com/rss/business

via the following code:
WebRequest webReq = WebRequest.Create(url);
WebResponse webResp = webReq.GetResponse();
StreamReader sr = new StreamReader(webResp.GetResponseStream(),
Encoding.GetEncoding("utf-8"));
char [] buffer = new char[1024];
int count = 0;
while ((count = sr.Read(buffer, 0, buffer.Length)) > 0)
{
// do something
}

I get an exception when I try to sr.Read() - chunk size is invalid.
I think I traced it down to this: the ContentLength of the webResp
object is -1. If I try this with other web sites that return a
Content-Length header, it works fine.

Is there a way to have the Reading of the stream not depend on this ?
I'd figure that the library would be forgiving about this, since not
every web application knows the length of the content before sending
it out.

I'm just getting started with the CF, but in the standard framework, I
prefer to avoid StreamReader in these situations. Instead, I process the raw
stream and decode it manually afterwards if appropriate; that way, the same
method can download any resource, not just text.

// This sample uses a 4 kB read buffer and a 64 kB memory buffer.
string responseText = null;
WebRequest request = WebRequest.Create(uri);
using (Stream responseStream = request.GetResponse().GetResponseStream()) {
    MemoryStream memoryStream = new MemoryStream(0x10000);
    byte[] buffer = new byte[0x1000];
    int bytes;
    while ((bytes = responseStream.Read(buffer, 0, buffer.Length)) > 0) {
        memoryStream.Write(buffer, 0, bytes);
    }
    responseText = Encoding.UTF8.GetString(memoryStream.ToArray());
}

Cheers,

Guest

Well, it looks like the problem is not because it's a StreamReader; the same
thing would happen if I replaced the StreamReader calls with:

Stream webStream = webResp.GetResponseStream();
webStream.Read(buffer, 0, buffer.Length);

So I don't think the MemoryStream trick will work (although thanks for the
tip :) ).

Have you tried it with the URL I passed in? I'll try it myself too.

Thanks,
Jay


Joerg Jooss said:
In .NetCF, upon trying to access the following URL:

http://rss.news.yahoo.com/rss/business

via the following code:
WebRequest webReq = WebRequest.Create(url);
WebResponse webResp = webReq.GetResponse();
StreamReader sr = new StreamReader(webResp.GetResponseStream(),
Encoding.GetEncoding("utf-8"));
char [] buffer = new char[1024];
int count = 0;
while ((count = sr.Read(buffer, 0, buffer.Length)) > 0)
{
// do something
}

I get an exception when I try to sr.Read() - chunk size is invalid.
I think I traced it down to this: the ContentLength of the webResp
object is -1. If I try this with other web sites that return a
Content-Length header, it works fine.

Is there a way to have the Reading of the stream not depend on this ?
I'd figure that the library would be forgiving about this, since not
every web application knows the length of the content before sending
it out.

I'm just getting started with the CF, but in the standard framework, I
prefer to avoid StreamReader in these situations. Instead, I process the raw
stream and decode it manually afterwards if appropriate; that way, the same
method can download any resource, not just text.

// This sample uses a 4 kB read buffer and a 64 kB memory buffer.
string responseText = null;
WebRequest request = WebRequest.Create(uri);
using (Stream responseStream = request.GetResponse().GetResponseStream()) {
MemoryStream memoryStream = new MemoryStream(0x10000);
byte[] buffer = new byte[0x1000];
int bytes;
while ((bytes = responseStream.Read(buffer, 0, buffer.Length)) > 0) {
memoryStream.Write(buffer, 0, bytes);
}
responseText = Encoding.UTF8.GetString(memoryStream.ToArray());
}

Cheers,

John Saunders

In .NetCF, upon trying to access the following URL:

http://rss.news.yahoo.com/rss/business

via the following code:
WebRequest webReq = WebRequest.Create(url);
WebResponse webResp = webReq.GetResponse();
StreamReader sr = new StreamReader(webResp.GetResponseStream(),
Encoding.GetEncoding("utf-8"));
char [] buffer = new char[1024];
int count = 0;
while ((count = sr.Read(buffer, 0, buffer.Length)) > 0)
{
// do something
}

I get an exception when I try to sr.Read() - chunk size is invalid. I think
I traced it down to this: the ContentLength of the webResp object is -1. If
I try this with other web sites that return a Content-Length header, it
works fine.

I've had the same problem in the full Framework when a site returns -1 for
Content-Length. This is apparently a sign that chunked transfer-mode will be
used, and I never found a way to get the content in that case.

Now, I was able to live with that, but if you can't, then perhaps you can
send a header in your request which would prohibit the use of chunked
transfer. I'm afraid I don't know which header you'd have to use.
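One thing that might work (just a sketch, I haven't verified it against .NET CF) is to downgrade the request to HTTP/1.0 rather than sending a special header: chunked transfer coding only exists in HTTP/1.1, so a conforming server then has to either send a Content-Length or close the connection when it's done.

```csharp
using System;
using System.Net;

class NoChunkingSketch
{
    static void Main()
    {
        // Cast to HttpWebRequest to reach the HTTP-specific properties.
        HttpWebRequest request =
            (HttpWebRequest)WebRequest.Create("http://rss.news.yahoo.com/rss/business");

        // HTTP/1.0 has no chunked transfer coding, so a well-behaved server
        // must fall back to Content-Length or Connection: close.
        request.ProtocolVersion = HttpVersion.Version10;

        Console.WriteLine(request.ProtocolVersion); // 1.0
    }
}
```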

Joerg Jooss

Well, it looks like the problem is not because it's a StreamReader,
the same would happen if I replaced the StreamWriter calls with:

Stream webStream = webResp.GetResponseStream();
webStream.Read(buffer, 0, buffer.Length);

So I don't think the MemoryStream trick will work (although thanks
for the tip :) ).

Have you tried it with the URL I passed in? I'll try it myself too.

It works fine, but http://rss.news.yahoo.com/rss/business doesn't use UTF-8.
The RSS feed uses ISO-8859-1 -- see its XML declaration. Adapt my code like
this:

Encoding iso88591 = Encoding.GetEncoding("ISO-8859-1");
responseText = iso88591.GetString(memoryStream.ToArray());

and it works fine.
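If you'd rather not hard-code the charset, a rough approach (a sketch of my own; the helper name is made up) is to peek at the XML declaration in the downloaded bytes before decoding them:

```csharp
using System;
using System.Text;

class CharsetSniffSketch
{
    // Pulls the encoding="..." value out of an XML declaration;
    // falls back to UTF-8 if none is found.
    public static Encoding SniffXmlEncoding(byte[] data)
    {
        // The declaration itself is ASCII-compatible, so ASCII-decode a prefix.
        string prefix = Encoding.ASCII.GetString(data, 0, Math.Min(data.Length, 100));
        int i = prefix.IndexOf("encoding=\"");
        if (i < 0)
            return Encoding.UTF8;
        int start = i + "encoding=\"".Length;
        int end = prefix.IndexOf('"', start);
        if (end < 0)
            return Encoding.UTF8;
        return Encoding.GetEncoding(prefix.Substring(start, end - start));
    }

    static void Main()
    {
        byte[] doc = Encoding.GetEncoding("ISO-8859-1").GetBytes(
            "<?xml version=\"1.0\" encoding=\"iso-8859-1\" ?><rss/>");
        Console.WriteLine(SniffXmlEncoding(doc).WebName); // iso-8859-1
    }
}
```

You'd run this over the MemoryStream's bytes before calling GetString(), instead of assuming a fixed encoding.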

Cheers,

--
Joerg Jooss
(e-mail address removed)

HTTP/1.1 200 OK
Date: Thu, 27 May 2004 18:06:07 GMT
P3P: policyref="http://p3p.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM
DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND
PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE GOV"
Connection: close
Content-Type: text/xml

<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0">
<channel>
<title>Yahoo! News - Business</title>
<link>http://news.yahoo.com/news?tmpl=index&amp;cid=1885</link>
<description>Yahoo! News - Business</description>
<language>en-us</language>
<lastBuildDate>Thu, 27 May 2004 17:54:08 GMT</lastBuildDate>
<ttl>5</ttl>
<image>
<title>Yahoo! News</title>
<width>142</width>
<height>18</height>
<link>http://news.yahoo.com/</link>
<url>http://us.i1.yimg.com/us.yimg.com/i/us/nws/th/main_142.gif</url>
</image>
<item>
<title>GDP Bumped Up Amid Slim Profit Rise (Reuters)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/nm/20040527/bs_nm/economy_dc</link>
<guid isPermaLink="false">nm/20040527/economy_dc</guid>
<pubDate>Thu, 27 May 2004 16:06:28 GMT</pubDate>
<description>Reuters - The U.S. economy grew a bit more
quickly in the first quarter than first estimated as businesses
scrambled to restock, while a revival in corporate profits
slowed, the government said on Thursday.</description>
</item>
<item>
<title>Stocks Buoyed by Dip in Oil Price, GDP (Reuters)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/nm/bs_nm/markets_stocks_dc</link>
<guid isPermaLink="false">nm/20040527/markets_stocks_dc</guid>
<pubDate>Thu, 27 May 2004 16:39:38 GMT</pubDate>
<description>Reuters - Blue-chips were higher on Thursday, as
a drop in oil prices below &amp;#36;40 fueled investor optimism and a
report showed U.S. gross domestic product grew a touch faster
in the first quarter than previously thought as businesses
scrambled to restock depleted shelves.</description>
</item>
<item>
<title>Rite Aid Ex-CEO Sentenced to Eight Years (Reuters)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/nm/20040527/bs_nm/retail_riteaid_dc</link>
<guid isPermaLink="false">nm/20040527/retail_riteaid_dc</guid>
<pubDate>Thu, 27 May 2004 17:11:13 GMT</pubDate>
<description>Reuters - The former chief executive of Rite Aid
Corp. (RAD.N) was sentenced on Thursday to eight years in
prison for his involvement in a &amp;#36;1.6 billion accounting scandal
that paralyzed the No. 3 U.S. drugstore chain four years ago.</description>
</item>
<item>
<title>Costco Profit Beats Expectations (Reuters)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/nm/20040527/bs_nm/retail_costco_earns_dc</link>
<guid isPermaLink="false">nm/20040527/retail_costco_earns_dc</guid>
<pubDate>Thu, 27 May 2004 17:37:01 GMT</pubDate>
<description>Reuters - Costco Wholesale Corp. (COST.O) on
Thursday posted a better-than-expected 29 percent jump in
quarterly profit, sending its stock higher, thanks to strong
spring sales and tighter cost controls.</description>
</item>
<item>
<title>Tyco Says It Will Pay Off Debt (Reuters)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/nm/20040527/bs_nm/manufacturing_tyco_dc</link>
<guid isPermaLink="false">nm/20040527/manufacturing_tyco_dc</guid>
<pubDate>Thu, 27 May 2004 16:53:21 GMT</pubDate>
<description>Reuters - Conglomerate Tyco International Ltd.
(TYC.N) on Thursday said it would pay off debt over the next
several quarters, taking another step toward shedding the
problems inherited from the reign of former chief Dennis
Kozlowski.</description>
</item>
<item>
<title>Economy Picks Up Pace in First Quarter (AP)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/ap/20040527/ap_on_bi_go_ec_fi/economy</link>
<guid isPermaLink="false">ap/20040527/economy</guid>
<pubDate>Thu, 27 May 2004 17:54:08 GMT</pubDate>
<description>AP - The economy grew at a 4.4 percent annual rate in the first
quarter of this year, slightly faster than previously thought and fresh
evidence that the recovery possessed good momentum as it headed into the
current quarter.</description>
</item>
<item>
<title>Stocks Buoyed by Dip in Oil Price, GDP (Reuters)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/nm/bs_nm/markets_stocks_dc</link>
<guid isPermaLink="false">nm/20040527/markets_stocks_dc</guid>
<pubDate>Thu, 27 May 2004 16:39:38 GMT</pubDate>
<description>Reuters - Blue-chips were higher on Thursday, as
a drop in oil prices below &amp;#36;40 fueled investor optimism and a
report showed U.S. gross domestic product grew a touch faster
in the first quarter than previously thought as businesses
scrambled to restock depleted shelves.</description>
</item>
<item>
<title>Wells Fargo Snaps Up Strong's Assets (AP)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/ap/20040527/ap_on_bi_ge/wells_fargo_strong</link>
<guid isPermaLink="false">ap/20040527/wells_fargo_strong</guid>
<pubDate>Thu, 27 May 2004 12:49:18 GMT</pubDate>
<description>AP - Banking giant Wells Fargo &amp;amp; Co. snapped up the
scandal-scarred mutual fund business of Strong Financial, hoping to repair
the damage caused by a fiasco that prompted peeved shareholders to sell
their holdings.</description>
</item>
<item>
<title>British retail kingpin mulls bid for Marks and Spencer (AFP)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/afp/20040527/bs_afp/britain_retail_company</link>
<guid isPermaLink="false">afp/20040527/britain_retail_company</guid>
<pubDate>Thu, 27 May 2004 17:48:14 GMT</pubDate>
<description>AFP - British retail tycoon Philip Green said he was preparing
a possible takeover offer for Marks and Spencer.</description>
</item>
<item>
<title>Useful Lesson From the Past (Forbes.com)</title>

<link>http://us.rd.yahoo.com/dailynews/rss/business/*http://story.news.yahoo.com/news?tmpl=story2&amp;u=/fo/20040526/bs_fo/58335fb34a5583f47754152929d05f1c</link>
<guid isPermaLink="false">fo/20040526/58335fb34a5583f47754152929d05f1c</guid>
<pubDate>Wed, 26 May 2004 17:32:28 GMT</pubDate>
<description>Forbes.com - The short, easy and wrong solution to the problems
in Iraq is to turn them all over to the United Nations and urge other
countries to help. This basically is Senator John Kerry's response to
any &quot;What would you do?&quot; questions. He and others who criticize
President Bush for going to war without the UN's permission and the
support of the international community offer only this egregiously useless
solution. The ignorance, misconceptions and faulty judgment displayed in
such thinking are appalling.</description>
</item>
</channel>
</rss>

Joerg Jooss

John said:
In .NetCF, upon trying to access the following URL:

http://rss.news.yahoo.com/rss/business

via the following code:
WebRequest webReq = WebRequest.Create(url);
WebResponse webResp = webReq.GetResponse();
StreamReader sr = new StreamReader(webResp.GetResponseStream(),
Encoding.GetEncoding("utf-8"));
char [] buffer = new char[1024];
int count = 0;
while ((count = sr.Read(buffer, 0, buffer.Length)) > 0)
{
// do something
}

I get an exception when I try to sr.Read() - chunk size is invalid.
I think I traced it down to this: the ContentLength of the webResp
object is -1. If I try this with other web sites that return a
Content-Length header, it works fine.

I've had the same problem in the full Framework when a site returns
-1 for Content-Length. This is apparently a sign that chunked
transfer-mode will be used, and I never found a way to get the
content in that case.


That's almost certainly caused by using the wrong character encoding. Once
the decoder sees an invalid byte sequence, it just stops. Thus, the web
response is corrupted.

Also note that a missing Content-Length header does *not* imply
"Transfer-Encoding: chunked", but "Connection: close" for a well behaving
HTTP 1.1 server. http://rss.news.yahoo.com/rss/business simply sends
"Connection: close".
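If you want to see which of the three cases you're actually hitting, you can classify the response from its headers. Here's a small sketch (the helper is mine, not a framework API; feed it the values from HttpWebResponse):

```csharp
using System;

class FramingCheckSketch
{
    // Classifies how a response body is delimited, given the header values.
    // contentLength is HttpWebResponse.ContentLength (-1 when the header
    // is absent); the other two come from response.Headers[...].
    public static string BodyFraming(long contentLength,
                                     string transferEncoding,
                                     string connection)
    {
        if (transferEncoding != null && transferEncoding.IndexOf(
                "chunked", StringComparison.OrdinalIgnoreCase) >= 0)
            return "chunked";
        if (contentLength >= 0)
            return "content-length";
        // No length and no chunking: the server marks the end of the
        // body by closing the connection.
        return "connection-close";
    }

    static void Main()
    {
        // What the Yahoo response captured above amounts to:
        Console.WriteLine(BodyFraming(-1, null, "close")); // connection-close
    }
}
```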

Cheers,

Guest

Thanks for your advice. I just tried this out, and I saw that the data I'm
getting is chunked. Strangely, if I try this from my office, the data
returned is not chunked, but doing this from home, I get a
"Transfer-Encoding: chunked" header. Is there no way to get the data if it
is chunked?

Joerg Jooss said:
John said:
In .NetCF, upon trying to access the following URL:

http://rss.news.yahoo.com/rss/business

via the following code:
WebRequest webReq = WebRequest.Create(url);
WebResponse webResp = webReq.GetResponse();
StreamReader sr = new StreamReader(webResp.GetResponseStream(),
Encoding.GetEncoding("utf-8"));
char [] buffer = new char[1024];
int count = 0;
while ((count = sr.Read(buffer, 0, buffer.Length)) > 0)
{
// do something
}

I get an exception when I try to sr.Read() - chunk size is invalid.
I think I traced it down to this: the ContentLength of the webResp
object is -1. If I try this with other web sites that return a
Content-Length header, it works fine.

I've had the same problem in the full Framework when a site returns
-1 for Content-Length. This is apparently a sign that chunked
transfer-mode will be used, and I never found a way to get the
content in that case.


That's almost certainly caused by using the wrong character encoding. Once
the decoder sees an invalid byte sequence, it just stops. Thus, the web
response is corrupted.

Also note that a missing Content-Length header does *not* imply
"Transfer-Encoding: chunked", but "Connection: close" for a well behaving
HTTP 1.1 server. http://rss.news.yahoo.com/rss/business simply sends
"Connection: close".

Cheers,

Joerg Jooss

Thanks for your advice. I just tried this out, and I saw that the
data I'm getting is chunked. Strangely, if I try this from my
office, the data returned is not chunked, but doing this from home, I
get a "Transfer-Encoding: chunked" header. There is no way to get
the data if it is chunked?

Interesting. For me, it's not chunked. But the code I've posted should work
just fine with chunking.
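For what it's worth, chunked transfer coding is just length-prefixed framing that HttpWebResponse strips out before GetResponseStream() hands you the body. A toy decoder (illustration only; it ignores chunk extensions and trailers) shows what's involved:

```csharp
using System;
using System.Text;

class ChunkDecodeSketch
{
    // Decodes a chunked-encoded ASCII body: hex chunk size, CRLF, chunk
    // data, CRLF, ..., terminated by a zero-size chunk.
    public static string DecodeChunked(string raw)
    {
        StringBuilder body = new StringBuilder();
        int pos = 0;
        while (true)
        {
            int eol = raw.IndexOf("\r\n", pos);
            int size = Convert.ToInt32(raw.Substring(pos, eol - pos), 16);
            if (size == 0)
                break;              // zero-size chunk ends the body
            body.Append(raw, eol + 2, size);
            pos = eol + 2 + size + 2; // skip the data and its trailing CRLF
        }
        return body.ToString();
    }

    static void Main()
    {
        string wire = "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
        Console.WriteLine(DecodeChunked(wire)); // hello world
    }
}
```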

Cheers,

JO

I think I figured out why it's not chunked from my office - at work we have
a proxy (Microsoft ISA I believe) that caches web pages and serves them up.
Whereas at home I don't have that, so I get the raw data. Have something
similar where you are?

I'll re-try your code at home. I think chunking just isn't supported :(

Joerg Jooss

JO said:
I think I figured out why it's not chunked from my office - at work
we have a proxy (Microsoft ISA I believe) that caches web pages and
serves them up. Whereas at home I don't have that, so I get the raw
data. Have something similar where you are?

I'll re-try your code at home. I think chunking just isn't supported
:(

Believe me, chunking is no problem. And I don't see chunked responses from
Yahoo ;-)

Cheers,

Guest

argh! :) I think you're right!

Although the other poster was saying that he couldn't figure out how to read
chunked responses. Could something be corrupting the chunks?