How to screen scrape a non-browser-based object?

andav

Hello All,
Forgive me if the answer I am looking for is fairly simple.
I am still fairly new to C# & .NET.

A client of mine is looking to automate a process. Initially I assumed
it would be some simple screen scraping and reporting.

Looking into the project some more, what happens is that they go to a
site and the site loads the following object:

<!-- Start IE Object Tag -->

<OBJECT ID="Seagull Web-to-Host Control Module v3"
CLASSID="clsid:037790A6-1576-11D6-903D-00105AABADD3"
CODEBASE="../sglw2hcm.ocx#Version=3,2,1,296" HEIGHT=0 WIDTH=0>
<PARAM NAME="IniFile" VALUE="default.ini">
<PARAM NAME="Sessions" VALUE="MD_S1">
<PARAM NAME="MD_DistFile" VALUE="display.e3d">
<PARAM NAME="MD_S1" VALUE="mfdisp1.zmd">
<PARAM NAME="MD_S1_Save" VALUE="No">
</OBJECT>

<!-- End IE Object Tag -->

This in turn starts a new window (not a browser window) that is
basically a terminal emulator for a 3270 mainframe screen.

They then enter the information into the mainframe to gather their
data.

I am fairly familiar with screen scraping legacy systems using tools
such as expect, and screen scraping web sites using sockets and web
requests.

I am curious whether anyone has ever "screen scraped" a Windows application
that is spawned from a web site?

Best Regards,
Anthony
 
cecil

Anthony,

You might be able to use
System.Diagnostics.Process.GetProcessesByName("ProcName") and then
retrieve the main window's handle. From there you could use the Win32 API to
get the handle of the textbox being used to display the text you want.
With the handle of the textbox you can get or set the contents. I am
no whiz with the Win32 API, but with a little research I have been able
to accomplish similar tasks.
Hope this helps
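
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: "ProcName" is a placeholder for whatever process actually owns the emulator window (it may be the browser process itself, since the control is loaded from a web page), and the class name ChildWindowLister is made up for the example. It lists each child window of the main window with its class name, so you can see whether anything resembling a standard text control is present. Many terminal emulators paint the screen themselves, in which case no such control will show up.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Text;

static class ChildWindowLister
{
    // Callback signature that EnumChildWindows expects.
    private delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool EnumChildWindows(
        IntPtr hWndParent, EnumWindowsProc callback, IntPtr lParam);

    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int GetClassName(
        IntPtr hWnd, StringBuilder className, int maxCount);

    static void Main()
    {
        // "ProcName" is a placeholder: the process name without ".exe".
        foreach (Process process in Process.GetProcessesByName("ProcName"))
        {
            IntPtr mainWindow = process.MainWindowHandle;
            if (mainWindow == IntPtr.Zero)
            {
                continue; // this process has no main window
            }

            Console.WriteLine("Process {0}, main window 0x{1:X}",
                process.Id, mainWindow.ToInt64());

            // Print the handle and window class of every child window.
            EnumWindowsProc callback = (hWnd, lParam) =>
            {
                var className = new StringBuilder(256);
                GetClassName(hWnd, className, className.Capacity);
                Console.WriteLine("  child 0x{0:X} class \"{1}\"",
                    hWnd.ToInt64(), className);
                return true; // returning true continues the enumeration
            };
            EnumChildWindows(mainWindow, callback, IntPtr.Zero);
        }
    }
}
```

A tool such as Spy++ shows the same window tree interactively, which can be quicker for a first look.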

Cecil Howell MCSD, MCAD.Net, MCT
 
andav

Thanks Cecil,
I used GetProcessesByName("procName") and then used EnumChildWindows
(Win32 API) and SendMessage with WM_GETTEXT, but I see in MSDN that
WM_GETTEXT will not retrieve the text from another application's Edit
child window.
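
For reference, the cross-process restriction in the documentation is stated for GetWindowText, whose remarks suggest sending WM_GETTEXT directly when the control belongs to another process. A minimal sketch of that call is below. It assumes the target really is a standard control that answers WM_GETTEXT, which may not hold for a terminal emulator that draws its own screen. The class name WindowTextReader and the idea of passing the handle as a hex command-line argument are just for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class WindowTextReader
{
    private const uint WM_GETTEXT = 0x000D;
    private const uint WM_GETTEXTLENGTH = 0x000E;

    // Overload used for messages that carry no buffer.
    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr SendMessage(
        IntPtr hWnd, uint msg, IntPtr wParam, IntPtr lParam);

    // Overload used for WM_GETTEXT, where lParam is the output buffer.
    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr SendMessage(
        IntPtr hWnd, uint msg, IntPtr wParam, StringBuilder lParam);

    public static string ReadText(IntPtr hWnd)
    {
        // Ask the control how many characters it holds (without the terminator).
        int length = SendMessage(hWnd, WM_GETTEXTLENGTH,
            IntPtr.Zero, IntPtr.Zero).ToInt32();
        if (length <= 0)
        {
            return string.Empty;
        }

        // wParam is the buffer size in characters, including the terminator.
        var buffer = new StringBuilder(length + 1);
        SendMessage(hWnd, WM_GETTEXT, new IntPtr(length + 1), buffer);
        return buffer.ToString();
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: WindowTextReader <window handle in hex>");
            return;
        }

        // The handle would normally come from EnumChildWindows; here it is
        // passed on the command line, e.g. a value copied from Spy++.
        IntPtr hWnd = new IntPtr(Convert.ToInt64(args[0], 16));
        Console.WriteLine(ReadText(hWnd));
    }
}
```

If this comes back empty for every child window, that suggests the emulator is not using a standard text control for its screen.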

Has anyone else found a workaround for getting the text from an Edit
window running in another application?

I am fairly comfortable with the Win32 APIs, so an example in C or C#
will work for me.

Best Regards,
Anthony
 
Nick Malik [Microsoft]

No offense, Anthony, but you are attempting to "drive" an application where
the maker of that application makes an API available, specifically for
driving it.

Permit me a little incredulity: why does it make sense to do this? How
little is your time worth?

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
 
andav

Thanks Nick,
The maker of the application does not provide an API for driving the
application, but for connecting to and driving mainframe screens.
In this situation the owner of the mainframe has blocked end users from
accessing the API used to drive the mainframe screens because the
mainframe is a state courts mainframe.
My client wants an automated process for retrieving information from
this application... so I am left with screen scraping it.
Obviously my time is worth the effort spent on screen scraping it.
But thank you again for your input.
Anthony
 
Nick Malik [Microsoft]

Hello Anthony,

I am familiar with Seagull software. I have used some of it in the past.
Among the utilities that they offer is the ability to do "web scraping."

Also, having worked for a courthouse in the past, I agree that they can be a
little stifled in their ability to understand the value of systems
integration. I have also done screen scraping applications in the past (for
financial institutions), usually using a very old API called HLLAPI. I've
written code against WRQ software, as well as Attachmate software, in
addition to Seagull. I understand where you are coming from.

While I appreciate that you are "downstream" from the application, I urge
you to get in touch with Seagull directly. They may still have a solution
for you that will allow you to save time and energy, and that will survive
the next version upgrade of their access tools (which can happen at any
time, without forewarning to you or your client, thus completely disabling
your software).

I do hope that I have helped.

Have a good holiday,
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

 
