Is C# appropriate for this?

  • Thread starter: Bit byte

Bit byte

I have a task which involves downloading data from some web pages. There
is a lot of messing about with forms (clicking buttons, selecting icons
etc) as well as parsing HTML to extract tables from the resulting web page.

I am torn as to whether to do this in PHP (or *shudder* Perl) or C#. I
know the C# language is "internet aware" (but so are the other
languages I mentioned).

Obviously, this being a C# ng, no prizes for guessing where the bias
will lie - BUT, I would be very grateful if anyone could point out where
C# may have an edge over the other languages - as well as any "gotchas"
(or drawbacks) I may need to be aware of ...
 
Bit byte,

Yes, you can definitely do this.

I would recommend that you look into using MSHTML for this. MSHTML is the
HTML parsing and rendering component behind Internet Explorer, and it exposes
the document object model for pages that are downloaded from the web.

While you can use this through C#, you really don't have to use C# if
you don't want to. MSHTML is a COM component, and is therefore accessible from
any language that offers access to COM components (.NET does, as do a good
number of other development technologies).

Hope this helps.
 
I have a task which involves downloading data from some web pages. There
is a lot of messing about with forms (clicking buttons, selecting icons
etc) as well as parsing HTML to extract tables from the resulting web
page.

I am torn as to whether to do this in PHP (or *shudder* Perl) or C#. I
know the C# language is "internet aware" (but so are the other
languages I mentioned).

Obviously, this being a C# ng, no prizes for guessing where the bias
will lie - BUT, I would be very grateful if anyone could point out where
C# may have an edge over the other languages - as well as any "gotchas"
(or drawbacks) I may need to be aware of ...

The C# language itself knows nothing about the internet; it knows about as
much as an Eskimo knows about curry. PHP and Perl are old-fashioned
interpreted languages, so they execute slowly compared to C#.
 
Bit said:
I have a task which involves downloading data from some web pages. There
is a lot of messing about with forms (clicking buttons, selecting icons
etc) as well as parsing HTML to extract tables from the resulting web page.

I am torn as to whether to do this in PHP (or *shudder* Perl) or C#. I
know the C# language is "internet aware" (but so are the other
languages I mentioned).

Obviously, this being a C# ng, no prizes for guessing where the bias
will lie - BUT, I would be very grateful if anyone could point out where
C# may have an edge over the other languages - as well as any "gotchas"
(or drawbacks) I may need to be aware of ...

I still haven't bitten the bullet and done more than read a PHP book,
but "anything Perl can do, C# can do better." That may be a slight
exaggeration, as Perl does have libraries for just about everything,
but it's probably not, as the FCL is awfully comprehensive. I do know
that lately all the "screen scraping" applets that I used to do in
Perl, I now do in C#:

* The FCL WebRequest is just as easy to use as the Perl
libraries that synchronously download web pages.

* The FCL Regex can do anything the Perl regex can do ... and more.

* It's just as easy to upload files (via FTP) in C# as in Perl, and
you don't even have to create a temp file.

And the kicker is that the C# is much easier to read.
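
A minimal sketch of the Regex point in the list above, assuming the page has already been downloaded into a string. The names `TableScraper` and `ExtractRows` are made up for illustration, and this kind of regex extraction only holds up on simple, non-nested tables:

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the FCL.
public static class TableScraper
{
    // Matches one <tr>...</tr> block; Singleline lets '.' span line breaks.
    private static readonly Regex RowPattern = new Regex(
        @"<tr\b[^>]*>(.*?)</tr>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches one <td> or <th> cell inside a row.
    private static readonly Regex CellPattern = new Regex(
        @"<t[dh]\b[^>]*>(.*?)</t[dh]>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Strips tags left inside a cell, such as <b> or <a href=...>.
    private static readonly Regex TagPattern = new Regex(@"<[^>]+>");

    // Returns each table row as a list of decoded, trimmed cell texts.
    public static List<List<string>> ExtractRows(string html)
    {
        var rows = new List<List<string>>();
        foreach (Match row in RowPattern.Matches(html))
        {
            var cells = new List<string>();
            foreach (Match cell in CellPattern.Matches(row.Groups[1].Value))
            {
                string text = TagPattern.Replace(cell.Groups[1].Value, "");
                cells.Add(WebUtility.HtmlDecode(text).Trim());
            }
            if (cells.Count > 0)
            {
                rows.Add(cells);
            }
        }
        return rows;
    }
}
```

Calling `TableScraper.ExtractRows` on a page string gives one inner list per `<tr>`, with entities such as `&amp;` decoded. Anything more tangled than a flat table is probably a job for a real HTML parser rather than a regex.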
 
Bhagat said:
The C# language itself knows nothing about the internet; it knows about as
much as an Eskimo knows about curry. PHP and Perl are old-fashioned
interpreted languages, so they execute slowly compared to C#.

Hi Bhagat,

I was wondering if you could clarify your point about C# not knowing the
internet at all. I'm wondering what you mean by this.

Thanks.
 
Hi Bhagat,

I was wondering if you could clarify your point about C# not knowing the
internet at all. I'm wondering what you mean by this.

The C# language is not concerned with the internet; it is just like any other
computer language specification. You are confusing the language itself with
the library of classes for using the internet. You do not need C# to use that
class library; you can use ASP.NET, VB.NET and other .NET languages.
 
Bhagat said:
The C# language is not concerned with the internet; it is just like any other
computer language specification. You are confusing the language itself with
the library of classes for using the internet. You do not need C# to use that
class library; you can use ASP.NET, VB.NET and other .NET languages.

How can somebody preach about C# being a language and knowing nothing
about the internet etc, and then go on to call ASP.Net a .net
language?? ASP.Net is just another part of the .Net FCL, it's not a
language in the slightest.
 
Bit byte,
You can use the classes in System.Net to handle much of this. Clicking
buttons, etc. is nothing more than a form post, so if you can figure out the
target you can handle that as well. If you need to do any heavy-duty HTML
parsing to scrape out specific contents, take a look at Simon Mourier's
HtmlAgilityPack, which is written in (Gasp!) -- C#.
Peter
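
A rough sketch of the "clicking a button is just a form post" idea above. It uses `HttpClient` from System.Net.Http rather than the older System.Net classes, which is a substitution made here and not something the post prescribes. `FormPoster`, `actionUrl` and the field names are hypothetical; the real values would come from the form's `action` attribute and the names of its `<input>` elements:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class FormPoster
{
    // Builds the application/x-www-form-urlencoded body that a browser
    // would send when the form's submit button is clicked.
    public static FormUrlEncodedContent BuildForm(
        IEnumerable<KeyValuePair<string, string>> fields)
    {
        return new FormUrlEncodedContent(fields);
    }

    // Posts the fields to the form's target and returns the response HTML.
    public static async Task<string> SubmitAsync(
        HttpClient client,
        string actionUrl,
        IEnumerable<KeyValuePair<string, string>> fields)
    {
        using (FormUrlEncodedContent content = BuildForm(fields))
        using (HttpResponseMessage response = await client.PostAsync(actionUrl, content))
        {
            // Throws if the server answered with a non-success status code.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The string returned by `SubmitAsync` is the resulting page, which can then be handed to whatever HTML parsing you settle on. Sites that rely on cookies, hidden fields or script-generated values will likely need those carried along in the field list as well.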
 
How can somebody preach about C# being a language and knowing nothing about
the internet etc, and then go on to call ASP.Net a .net language?? ASP.Net is
just another part of the .Net FCL, it's not a language in the slightest.

I find Bhagat's main point to be clear enough. And correct, AFAIK - any
knowledge of the internet is in applications, not the language itself. If you
disagree, perhaps you could post an example of "internet aware" C# syntax, or a
reference to the relevant portion of the language spec?

I hadn't noticed any particular "preaching" in this thread, not sure what you
mean by that. There may have been a bit of trivial carping though, now that you
mention it . . .

Regards,
-rick-
 
Rick said:
I find Bhagat's main point to be clear enough. And correct, AFAIK - any
knowledge of the internet is in applications, not the language itself. If you
disagree, perhaps you could post an example of "internet aware" C# syntax, or a
reference to the relevant portion of the language spec?

I hadn't noticed any particular "preaching" in this thread, not sure what you
mean by that. There may have been a bit of trivial carping though, now that you
mention it . . .

Regards,
-rick-

I have to own up and admit that my post was pointless really, as
although Bhagat's post wasn't completely accurate, you could easily see
what he meant.

My apologies
 
Jon Shemitz said:
I still haven't bitten the bullet and done more than read a PHP book,
but "anything Perl can do, C# can do better." That may be a slight
exaggeration, as Perl does have libraries for just about everything,
but it's probably not, as the FCL is awfully comprehensive. I do know
that lately all the "screen scraping" applets that I used to do in
Perl, I now do in C#:

* The FCL WebRequest is just as easy to use as the Perl
libraries that synchronously download web pages.

* The FCL Regex can do anything the Perl regex can do ... and more.

* It's just as easy to upload files (via FTP) in C# as in Perl, and
you don't even have to create a temp file.

And the kicker is that the C# is much easier to read.


Hmm, I don't think C# is much easier to read. I think that the .NET IDE
makes it easier because of IntelliSense, syntax coloring, regions, etc. I
think that if Perl had all this, it *could* be just as easy to read. Also,
like most other topics relating to *easier to read*, it all depends on the
developer that wrote it. I know people that can write C# that is almost
unreadable (and you'd think it went through a blender).

:) Just a side-note .. I agree that C# is better because (and this is the
best point I'm making)...I like C# better :P

Mythran
 
