Best way for reading HTTP data

Nuno Magalhaes

I've got a problem which relates to reading HTTP data.
I've got the socket connected to a web site and then I send "GET /
HTTP/1.1\n\n" and the page is received after a while but not all of the
page. Should I implement a timer to read the web page? How do I know
when the page is completed if sometimes socket.Available is 0?

The procedure is as follows:
Socket socket=new Socket(AddressFamily.InterNetwork,SocketType.Stream,ProtocolType.Tcp);
socket.Connect(endpoint);
byte[] msg=Encoding.UTF8.GetBytes("GET / HTTP/1.1\n\n");
byte[] bytes=new byte[65536];
int i=socket.Send(msg,0,msg.Length,SocketFlags.None);
MessageBox.Show("Sent "+i.ToString()+" bytes. Available: "+socket.Available.ToString()+" bytes.");
socket.Receive(bytes,0,socket.Available,SocketFlags.None);
TrafficLogTextBox.Text+=Encoding.UTF8.GetString(bytes);
TrafficLogTextBox.Text+="\r\n";
MessageBox.Show(Encoding.UTF8.GetString(bytes));

How does HttpWebResponse implement this? Does it use a timer while no
data is arriving? *How do I know when the page is complete?*
Did I make myself clear?

Thanks a lot,
Nuno Magalhaes.
 
Vadym Stetsyak

A web page can be large, so it is normal for it to arrive over
several calls to Receive(...).
To handle this you have to parse the HTTP-specific data. The size of the
response body that the server will send is given in the Content-Length HTTP header.

So the algorithm is the following:
- receive the first part of the response, which contains the HTTP headers
that describe the data that will follow;
- receive the amount of data specified in the Content-Length header.
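
The two steps above can be sketched roughly as follows. This is only a minimal illustration, not how HttpWebResponse works internally: the class and method names (HttpFraming, FindHeaderEnd, ParseContentLength, ReadBody) are made up for this example, and it assumes the server sends a Content-Length header and a single response on the connection.

```csharp
using System;
using System.IO;
using System.Text;

public static class HttpFraming   // hypothetical helper, not part of .NET
{
    // Returns the index of the first byte after "\r\n\r\n", or -1 if it has not arrived yet.
    public static int FindHeaderEnd(byte[] data, int length)
    {
        for (int i = 0; i + 3 < length; i++)
        {
            if (data[i] == 13 && data[i + 1] == 10 && data[i + 2] == 13 && data[i + 3] == 10)
                return i + 4;
        }
        return -1;
    }

    // Returns the Content-Length value, or -1 if the header is missing or unparsable.
    public static long ParseContentLength(string headers)
    {
        foreach (string line in headers.Split(new[] { "\r\n" }, StringSplitOptions.None))
        {
            int colon = line.IndexOf(':');
            if (colon > 0 && line.Substring(0, colon).Trim()
                    .Equals("Content-Length", StringComparison.OrdinalIgnoreCase))
            {
                long value;
                if (long.TryParse(line.Substring(colon + 1).Trim(), out value)) return value;
            }
        }
        return -1;
    }

    // Reads the headers first, then exactly Content-Length bytes of body.
    public static byte[] ReadBody(Stream stream, out string headers)
    {
        var received = new MemoryStream();
        var chunk = new byte[8192];
        int headerEnd = -1;

        // Step 1: keep reading until the blank line that ends the headers has arrived.
        while (headerEnd < 0)
        {
            int n = stream.Read(chunk, 0, chunk.Length);
            if (n == 0) throw new IOException("Connection closed before the headers were complete.");
            received.Write(chunk, 0, n);
            headerEnd = FindHeaderEnd(received.GetBuffer(), (int)received.Length);
        }

        headers = Encoding.ASCII.GetString(received.GetBuffer(), 0, headerEnd);
        long contentLength = ParseContentLength(headers);
        if (contentLength < 0) throw new NotSupportedException("No Content-Length header.");

        // Step 2: keep reading until headers plus Content-Length bytes have arrived.
        while (received.Length < headerEnd + contentLength)
        {
            int n = stream.Read(chunk, 0, chunk.Length);
            if (n == 0) throw new IOException("Connection closed before the body was complete.");
            received.Write(chunk, 0, n);
        }

        var body = new byte[contentLength];
        Array.Copy(received.GetBuffer(), headerEnd, body, 0, (int)contentLength);
        return body;
    }
}
```

With a connected socket, the stream could be a `new NetworkStream(socket)`. Note that the loops never look at socket.Available: a read simply blocks until some data arrives, and returns 0 only when the server closes the connection. When there is no Content-Length header, the response is typically sent with Transfer-Encoding: chunked or ended by closing the connection, and this sketch does not handle either case.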
 
Nuno Magalhaes

In most cases I don't get the "Content-Length" field in the HTTP
response header.
Any hints on what I could be doing wrong, or what I should be doing instead?

Thank you Vadym.

 
tdavisjr

I think you are taking the wrong approach here. You should use the
HttpWebRequest and HttpWebResponse classes. They are much, much easier
than raw sockets. Here is some sample code you can start with:

Imports System
Imports System.IO
Imports System.Net

Module Program
    Sub Main()
        Dim objRequest As HttpWebRequest
        Dim objResponse As HttpWebResponse
        Dim srResponse As StreamReader
        Dim strUrl As String = "http://www.msn.com"

        'initialize the request
        objRequest = CType(WebRequest.Create(strUrl), HttpWebRequest)
        objRequest.Method = "GET"

        'get response
        objResponse = CType(objRequest.GetResponse(), HttpWebResponse)
        srResponse = New StreamReader(objResponse.GetResponseStream())
        'ReadToEnd returns once the whole response body has been read
        Console.WriteLine(srResponse.ReadToEnd())
        srResponse.Close()
        Console.ReadLine()
    End Sub
End Module

It's in VB.NET, but you should be able to convert it quite easily. To
answer your question about how to know when the reading is done: as you
can see, calling the ReadToEnd() method on the StreamReader object
handles this for you.

I hope this helps
 
Nuno Magalhaes

Maybe I'm not passing all the parameters to the server also. Do you
know if sending "GET / HTTP/1.1\n\n" is enough to receive the content
length field?
 
Nuno Magalhaes

I can't use those higher-level functions because I'm measuring QoS
parameters such as: time to resolve DNS, time to connect, time to
receive data, time to display the whole web page, etc.

Do you know if the "GET / HTTP/1.1" is enough to receive the
"Content-Length: " parameter in the HTTP response header?

Thank you.
 
tdavisjr

OK. I felt kind of bad posting VB code in a C# newsgroup, as I forgot
which group I was in, so I will convert this on the fly for you. Sorry
about that, guys! My bad!

using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string strUrl = "http://www.msn.com";

        // initialize the request
        HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create(strUrl);
        objRequest.Method = "GET";

        // get the response; ReadToEnd returns once the whole body has been read
        HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
        StreamReader srResponse = new StreamReader(objResponse.GetResponseStream());
        Console.WriteLine(srResponse.ReadToEnd());
        srResponse.Close();
        Console.ReadLine();
    }
}
 
tdavisjr

The GET should do it, but I would use HEAD, as it only retrieves the
headers:

HEAD / HTTP/1.1\r\n
Host: localhost (or whatever)\r\n
\r\n

Note that each line ends with \r\n instead of \n, and that an empty
line ends the headers.

\r\n = carriage return/line feed
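
As a rough sketch of building such a request in C# before passing it to socket.Send: the HttpRequestText class and its Build method are made-up names for illustration. The "Connection: close" header is an addition not mentioned above; it asks the server to close the connection after the response, so a Receive call that returns 0 then marks the end of the data even when no Content-Length is sent.

```csharp
using System;
using System.Text;

public static class HttpRequestText   // hypothetical helper, not part of .NET
{
    // Builds a minimal HTTP/1.1 request with CRLF line endings.
    public static byte[] Build(string method, string path, string host)
    {
        var sb = new StringBuilder();
        sb.Append(method).Append(' ').Append(path).Append(" HTTP/1.1\r\n");
        sb.Append("Host: ").Append(host).Append("\r\n");   // Host is required in HTTP/1.1
        sb.Append("Connection: close\r\n");                // assumption: one request per connection
        sb.Append("\r\n");                                 // empty line ends the headers
        return Encoding.ASCII.GetBytes(sb.ToString());
    }
}
```

For example, `socket.Send(HttpRequestText.Build("GET", "/", "www.example.com"))` would send a GET for the root page (the host name here is only a placeholder). Even with "Connection: close", the server may still send the body with Transfer-Encoding: chunked, so the raw bytes can contain chunk-size lines.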
 
