viewing the contents of a POST request

sklett

Hi all,

While this is not specific to asp.net, this does seem like the most
appropriate forum (that I could find) to ask the following:

I'm trying to write a screen scraper that needs to submit a POST request,
then scrape the response. I've tried to determine the contents of the POST
via firebug but can't figure it out. I'm wondering if any of you could
recommend an application that will show exactly what is submitted when I
click Search on the following page:
https://nppes.cms.hhs.gov/NPPES/NPIRegistrySearch.do?subAction=reset&searchType=ind

I've never written a "scraper" before, so I'm not sure how it's done, but I
imagine it would go something like this: once I manage to POST the form, I
will read the response into some kind of DOM object, then locate the elements
I'm interested in and read their innerHTML properties.
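
In Python, for instance, that read-and-locate step could be sketched with the
standard library's html.parser. This is only an illustration: the
CellCollector class, the extract_cells helper and the sample markup are all
made up here, and the real results page will almost certainly be structured
differently.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text content of every <td> element in a page."""

    def __init__(self):
        super().__init__()
        self.cells = []      # text of each completed <td>
        self._buffer = None  # text chunks while inside a <td>, otherwise None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) also lands here.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


# Placeholder fragment, NOT the real NPPES markup.
SAMPLE_HTML = """
<table>
  <tr><td>1234567890</td><td>DOE, <b>JANE</b></td><td>CA</td></tr>
</table>
"""


def extract_cells(html):
    """Return the text of every table cell found in the given HTML string."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


print(extract_cells(SAMPLE_HTML))
```

A fuller DOM-style library would make locating specific elements easier, but
the idea is the same: feed in the response text, then pick out the nodes of
interest.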

Anyway, that's how I imagine it will work. First hurdle: POSTing the form.

Thanks for any help,
Steve
 
Anthony Jones

http://www.fiddlertool.com/fiddler

Fiddler is an excellent debugging proxy. Once it is running, use the site as
normal, then review in detail the conversation the browser had with the site.
 
sklett

Thanks for the tip Anthony!

I've already downloaded it and captured the info I need (I think). It looks
like this is what I need to send in my request:

<POST data>

POST /NPPES/NPIRegistrySearch.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, */*
Referer: https://nppes.cms.hhs.gov/NPPES/NPIRegistrySearch.do?subAction=reset&searchType=ind
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648)
Host: nppes.cms.hhs.gov
Content-Length: 163
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: JSESSIONID=0000b48FKemf-yxDhvGDlgxB4Nf:12l6retk3

org.apache.struts.taglib.html.TOKEN=643755f453b294990412442d6e4fb304&searchType=ind&searchNpi=&firstName=David&lastName=Paskil&city=&state=CA&zip=&subAction=Search

</POST data>

If I'm reading this correctly, the ONLY data included in the body of the
request is the form field name/value pairs.
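
One way to check that reading is to decode the captured body. The sketch
below runs Python's urllib.parse.parse_qsl over the exact string from the
capture (the BODY and fields names are just for illustration) and also
compares its length with the Content-Length header of 163.

```python
from urllib.parse import parse_qsl

# The request body exactly as captured above.
BODY = (
    "org.apache.struts.taglib.html.TOKEN=643755f453b294990412442d6e4fb304"
    "&searchType=ind&searchNpi=&firstName=David&lastName=Paskil"
    "&city=&state=CA&zip=&subAction=Search"
)

# keep_blank_values=True keeps the empty fields (searchNpi, city, zip),
# which parse_qsl would otherwise drop.
fields = parse_qsl(BODY, keep_blank_values=True)

for name, value in fields:
    print(f"{name} = {value!r}")

# The body is plain ASCII, so its byte length equals its character count.
print("body length:", len(BODY.encode("ascii")))
```

The body decodes cleanly into nine name/value pairs with nothing left over,
and its length agrees with the Content-Length header, which supports the
reading that the form fields are all that is sent.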

I will use Fiddler to sniff my own traffic when trying to POST. Once my
request looks exactly the same as the browser's (minus the session ID), I
should be in good shape.
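
As a rough sketch of that plan in Python (standard library only), the request
could be rebuilt as below. The function names build_search_request and
make_opener are made up for this example. One assumption worth flagging: the
org.apache.struts.taglib.html.TOKEN value is probably tied to the session in
the same way as the JSESSIONID cookie, so a real run would likely need to GET
the form page first and read a fresh token from its hidden field. The token
used here is the one from the capture and only serves to show that the body
comes out identical.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import (HTTPCookieProcessor, ProxyHandler, Request,
                            build_opener)

SEARCH_URL = "https://nppes.cms.hhs.gov/NPPES/NPIRegistrySearch.do"
FORM_URL = SEARCH_URL + "?subAction=reset&searchType=ind"


def build_search_request(token, first_name, last_name, state):
    """Build (but do not send) a POST with the same fields as the capture."""
    fields = [
        ("org.apache.struts.taglib.html.TOKEN", token),
        ("searchType", "ind"),
        ("searchNpi", ""),
        ("firstName", first_name),
        ("lastName", last_name),
        ("city", ""),
        ("state", state),
        ("zip", ""),
        ("subAction", "Search"),
    ]
    body = urlencode(fields).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Referer": FORM_URL,
    }
    return Request(SEARCH_URL, data=body, headers=headers, method="POST")


def make_opener(proxy=None):
    """Opener that keeps cookies (so JSESSIONID carries over between calls)
    and optionally routes traffic through a debugging proxy."""
    handlers = [HTTPCookieProcessor(CookieJar())]
    if proxy:
        handlers.append(ProxyHandler({"http": proxy, "https": proxy}))
    return build_opener(*handlers)


# Token copied from the capture; it has almost certainly expired by now.
request = build_search_request(
    "643755f453b294990412442d6e4fb304", "David", "Paskil", "CA")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Nothing is sent by this code. To send it, something like
make_opener("http://127.0.0.1:8888").open(request) would route the call
through Fiddler, assuming it is listening on its usual port 8888. For HTTPS,
Python would also need to trust Fiddler's root certificate before the
decrypted traffic shows up.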

Thanks again,
Steve
 
