How do I get the COMPLETE response from HttpWebResponse using StreamReader?

Thread starter: Hexman

Code below ----

I've asked a similar question on this forum earlier. This is a slightly different situation.

Previous Question ----

I'm trying to save some specific web pages to disk as text files. I searched the Internet and found a basic example, which I changed to fit my needs.
I tested it first on a single URL and it worked fine. Now that I've incorporated an array of URLs, it fails. The first "responseFromServer"
properly retrieves and displays "http://finance.yahoo.com" (or any other valid URL passed to the first WebRequest.Create), but when I use the URLs from the
array in the For/Next loop, I never get a responseFromServer. My guess is that the WebRequest, Stream, or WebResponse needs some kind of
initialization.

Current Question ----

After changing some code, I now get responses returned to me. BUT the pages I am going after use JavaScript to build the retrieved data,
tables, layout, etc. What I get back is the JavaScript code but none of the data I'm looking for. I thought the problem was the initialization of
some object, but now I believe that the actual page has not been built by the time my program is ready for the StreamReader. I may be all wrong on
that assumption, but I need someone to steer me in the right direction.

Even checking EndOfStream doesn't wait for the page to be built.

Thanks,

Hexman

P.S. I can run the program repeatedly, and on some runs SOME of the pages are complete! To me, that points to the building of the requested
pages as the cause.



Newer Code ----
--------------------------------------------------------------------------------------------
This code replaces the processing of the Old Code. Changes: Try-Catch, ReadLine
rather than ReadToEnd, URL from a datatable. None of these changes should affect the
problem area of the program.

Try
    Dim request As HttpWebRequest = WebRequest.Create(URL)
    Dim response As HttpWebResponse = request.GetResponse()
    Dim reader As StreamReader = New StreamReader(response.GetResponseStream())
    strToWrite = ""
    Do While reader.EndOfStream = False
        str = reader.ReadLine()
        strToWrite = strToWrite & str
    Loop

    strOutFile = OutFN(Idx)
    File.WriteAllText(strOutFile, strToWrite)
    reader.Close()
    response.Close()

Catch ex As WebException
    Console.WriteLine(ex.Message)
    If ex.Status = WebExceptionStatus.ProtocolError Then
        Console.WriteLine("Status Code : {0}", CType(ex.Response, HttpWebResponse).StatusCode)
        Console.WriteLine("Status Description : {0}", CType(ex.Response, HttpWebResponse).StatusDescription)
    End If
Catch ex As Exception
    Console.WriteLine(ex.Message)
End Try
End If

Old Code ----
--------------------------------------------------------------------------------------------
Imports System
Imports System.IO
Imports System.Net
Imports System.Net.WebClient
Imports System.Windows.Forms
Public Class Form1

    Private Sub btnProcess_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnProcess.Click
        Dim x As Integer
        Dim aryUrls(15) As String
        aryUrls(0) = "http://finance.yahoo.com/q/bc?s=DOW&t=1d"
        aryUrls(1) = "http://finance.yahoo.com/q/bc?s=IBM&t=6m"
        aryUrls(2) = "http://finance.yahoo.com/q/bc?s=MSFT&t=6m"
        aryUrls(3) = "http://biz.yahoo.com/top.html"

        Dim request As WebRequest = WebRequest.Create("http://finance.yahoo.com/")
        Dim response As WebResponse = request.GetResponse()
        Dim dataStream As Stream = response.GetResponseStream()
        Dim reader As New StreamReader(dataStream)
        Dim responseFromServer As String = reader.ReadToEnd()

        MsgBox(responseFromServer) ' <=== Shows finance.yahoo.com

        For x = 0 To 3
            request = WebRequest.Create(aryUrls(x))
            dataStream = response.GetResponseStream()
            responseFromServer = reader.ReadToEnd()
            MsgBox(responseFromServer) ' <=== Shows nothing
        Next

        reader.Close()
        response.Close()
    End Sub
End Class
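
Editor's sketch for the Old Code above: inside the For loop a new request is created, but GetResponse is never called on it, and the reader is still attached to the first response, which has already been read to the end. That would explain why the loop shows nothing. Below is a minimal sketch of one way to restructure it, with a fresh request, response, and reader per URL. The module name FetchPages, the helper FetchSource, and the output file names page0.txt, page1.txt, ... are hypothetical and not from the original post. Note that this still returns only what the server sends; content that JavaScript builds in a browser will not be in it.

```vb
Imports System
Imports System.IO
Imports System.Net

Module FetchPages

    ' Hypothetical helper: downloads the raw response body for one URL.
    ' Every call creates its own request, response and reader.
    Function FetchSource(ByVal url As String) As String
        Dim request As WebRequest = WebRequest.Create(url)
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                ' ReadToEnd keeps the line breaks that a ReadLine loop would drop.
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        Dim urls() As String = { _
            "http://finance.yahoo.com/q/bc?s=DOW&t=1d", _
            "http://finance.yahoo.com/q/bc?s=IBM&t=6m", _
            "http://finance.yahoo.com/q/bc?s=MSFT&t=6m", _
            "http://biz.yahoo.com/top.html" _
        }

        For i As Integer = 0 To urls.Length - 1
            Try
                Dim source As String = FetchSource(urls(i))
                ' Hypothetical output name; the original post uses OutFN(Idx).
                File.WriteAllText("page" & i.ToString() & ".txt", source)
            Catch ex As WebException
                ' Report the failure and continue with the next URL.
                Console.WriteLine(urls(i) & ": " & ex.Message)
            End Try
        Next
    End Sub

End Module
```

The Using blocks close the reader and the response even when an exception is thrown, so a failed page does not leave a connection open for the next request.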
 
Hexman,

Be aware that many web pages do not consist of a single document but of several
documents, for example frames, each with its own URL.

If you really need the complete page as HTML, then in my opinion the
WebBrowser control is the best fit for that.

I hope this helps,

Cor.
 
Thanks, I'll read up on the webbrowser.

Hexman

 
Cor,

I've looked into WebBrowser, and it appears to be able to give me what I need. I tried it in code with ONE URL, using a TextBox to display the HTML
source and the WebBrowser to display the rendered page, and I've written the text to a file. All is well......EXCEPT.....when I use an array (actually
a DataTable) containing the URLs I want to retrieve, they're processed so fast that a page is not completely loaded before navigation to the next one
starts.

I've added a "WebBrowserDocumentCompletedEventHandler", and it does fire, but not when I expect it to.

I need some kind of delay mechanism that waits until each page is completely loaded.

For this app I need to run the WebBrowser procedurally (modally), so that it doesn't move on to the next instruction until it has finished loading the
page. (It seems that when the WebBrowser Navigate command is given, a new, independent thread is started.)

Is there a "wait for event" call, something like WaitForEvent(WebBrowser.DocumentCompleted)?

I'm almost over the edge,

Hexman
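
Editor's sketch for the question above: Navigate returns immediately and the page loads asynchronously, so rather than blocking until the event arrives, one common pattern is to start the next navigation from inside the DocumentCompleted handler. The sketch below assumes a Windows Forms application; the class name PageSaverForm, the queue of URL and file-name pairs, and the output file names are hypothetical. DocumentCompleted can fire once per frame, which may be why it seems to fire at unexpected times, so the handler ignores events that do not belong to the top-level document. This is not guaranteed to capture content that scripts add after the document has completed.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Windows.Forms

Public Class PageSaverForm
    Inherits Form

    Private ReadOnly browser As New WebBrowser()
    ' Each entry pairs a URL (Key) with a hypothetical output file name (Value).
    Private ReadOnly pending As New Queue(Of KeyValuePair(Of String, String))()
    Private currentOutFile As String

    Public Sub New()
        browser.Dock = DockStyle.Fill
        browser.ScriptErrorsSuppressed = True
        Controls.Add(browser)
        AddHandler browser.DocumentCompleted, AddressOf Browser_DocumentCompleted

        pending.Enqueue(New KeyValuePair(Of String, String)("http://finance.yahoo.com/q/bc?s=IBM&t=6m", "ibm.txt"))
        pending.Enqueue(New KeyValuePair(Of String, String)("http://finance.yahoo.com/q/bc?s=MSFT&t=6m", "msft.txt"))
    End Sub

    Protected Overrides Sub OnShown(ByVal e As EventArgs)
        MyBase.OnShown(e)
        NavigateToNext() ' start the chain once the form is visible
    End Sub

    ' Starts loading the next queued URL, if any remain.
    Private Sub NavigateToNext()
        If pending.Count = 0 Then
            Return
        End If
        Dim item As KeyValuePair(Of String, String) = pending.Dequeue()
        currentOutFile = item.Value
        browser.Navigate(item.Key)
    End Sub

    Private Sub Browser_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
        ' Skip events raised for frames or for a page that is still loading.
        If browser.ReadyState <> WebBrowserReadyState.Complete Then
            Return
        End If
        If Not e.Url.Equals(browser.Url) Then
            Return
        End If

        ' If script-generated markup is missing from DocumentText,
        ' browser.Document.Body.OuterHtml is one alternative to try.
        File.WriteAllText(currentOutFile, browser.DocumentText)

        ' Only now move on, so pages are processed strictly one at a time.
        NavigateToNext()
    End Sub

    <STAThread()> Public Shared Sub Main()
        Application.Run(New PageSaverForm())
    End Sub

End Class
```

Chaining the navigations from the handler keeps the UI thread free to process the browser's messages; a loop that sleeps while waiting would stall that thread and could keep the page from ever finishing.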
 