a problem help me

Dhananjay · Nov 28, 2006

Hello everyone,
Does anyone have information on how to build a tool in .NET that
translates web page contents to HTML format?

Please reply as soon as possible.

Thanks in advance

Dhananjay

Marc Gravell · Nov 28, 2006

Given that most web pages are *in* HTML (or a variant), it wouldn't
have a lot to do...

Can you clarify what you mean?

Marc

Dhananjay · Nov 28, 2006

Marc said:
Given that most web pages are *in* HTML (or a variant), it wouldn't
have a lot to do...

Can you clarify what you mean?

Marc

Hello Marc,
My problem is this:
First, I have to import a client to a website (a specified website,
and there may be more than one). Then I have to build a tool that
converts the web page contents to HTML format and saves that HTML to a
database (SQL Server).
How can I achieve this?
Could you please help me with it?

Please reply as soon as possible.

Thanks
Dhananjay

Marc Gravell · Nov 28, 2006

You don't need to "convert" anything here. The website is probably in
HTML already. If it isn't you won't be able to do it. You may be able
to use WebClient here to simply download the text (see MSDN2) - but
even then it won't be a usable static copy, as all images, scripts,
cookies, links etc will probably be dead if you *just* use the HTML in
isolation of the other stuff.
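
A minimal sketch of the WebClient approach mentioned above. The class name PageDownloader and the UTF-8 encoding are illustrative assumptions, not something from this thread. It returns only the page's markup, so images, scripts, and stylesheets referenced by the page are not fetched.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper; the name is an assumption made for this example.
public static class PageDownloader
{
    // Downloads the resource at the given URL and returns it as text.
    public static string DownloadHtml(string url)
    {
        using (WebClient client = new WebClient())
        {
            // Assumption: the page is UTF-8. Real pages may declare
            // another encoding in their headers or markup.
            client.Encoding = Encoding.UTF8;
            return client.DownloadString(url);
        }
    }
}
```

A call would look like `string html = PageDownloader.DownloadHtml("http://example.com/");`, where the URL is a placeholder. On recent .NET versions WebClient is marked obsolete in favour of HttpClient, so treat this as a sketch of the idea rather than a recommendation.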

You could try and use WebBrowser to export as an mht; never tried it -
might work. Alternatively if it is for later reference you might try
using tools like HTMLDOC to create a standalone PDF of the page (hence
including the images but not scripts).

Alternatively you can find lots of crawlers on google to do this for
you.

It really depends on what *exactly* you need. And unless this is
somehow a C# issue you may find other groups more useful.

Marc

Dhananjay · Nov 28, 2006

Marc said:
You don't need to "convert" anything here. The website is probably in
HTML already. If it isn't you won't be able to do it. You may be able
to use WebClient here to simply download the text (see MSDN2) - but
even then it won't be a usable static copy, as all images, scripts,
cookies, links etc will probably be dead if you *just* use the HTML in
isolation of the other stuff.

You could try and use WebBrowser to export as an mht; never tried it -
might work. Alternatively if it is for later reference you might try
using tools like HTMLDOC to create a standalone PDF of the page (hence
including the images but not scripts).

Alternatively you can find lots of crawlers on google to do this for
you.

It really depends on what *exactly* you need. And unless this is
somehow a C# issue you may find other groups more useful.

Marc

Hello Marc,
Thanks anyway for spending time on this.
I tried what you suggested, but it is not working; it reports a
namespace problem. I think this feature is different. I am using VS2005
with C#.
Can you tell me one thing: would my problem be better solved by creating
a Windows application or a web application? First I have to import
client to a website, and then generate a tool to convert webpage
contents to html format and save it to a sql server database.
At first I was working in VB.NET, where I built a tool that converts
web page contents to HTML format, but the same thing is not working
in C#.

Please reply.
Thanks
Waiting for your reply,
Dhananjay

Kodali Ranganadh · Nov 28, 2006

Hi Dhananjay,

Try working with the HTTP request and response objects; they are very
helpful for this. Simply send a request to the specific URL using
HttpWebRequest, and then save the content stream of the response into
your database. That part is simple: you get back the page you
requested. But your aim is to get the whole website, so search the main
response stream for other links, form a URL from each one, and process
them the same way...
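
The request and link-finding steps described above might be sketched roughly as follows. The class and method names are made up for illustration, and the regular expression is a deliberately crude way to find links (it picks up any href attribute, including stylesheet references); a real HTML parser would be more robust. Saving to the database is left out here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper; names are assumptions made for this example.
public static class SimpleFetcher
{
    // Sends a request to the URL and returns the response body as text.
    public static string Fetch(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Finds href values in the markup and resolves each one against the
    // URL of the page it came from, so relative links become absolute.
    public static List<Uri> ExtractLinks(Uri pageUrl, string html)
    {
        List<Uri> links = new List<Uri>();
        MatchCollection matches = Regex.Matches(
            html, "href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);
        foreach (Match m in matches)
        {
            Uri absolute;
            bool ok = Uri.TryCreate(pageUrl, m.Groups[1].Value, out absolute);
            // Keep only http/https links, and skip duplicates.
            if (ok
                && (absolute.Scheme == Uri.UriSchemeHttp
                    || absolute.Scheme == Uri.UriSchemeHttps)
                && !links.Contains(absolute))
            {
                links.Add(absolute);
            }
        }
        return links;
    }
}
```

Each URL returned by ExtractLinks could then be passed back to Fetch, which is the "process the same way" step.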

Nick Malik [Microsoft] · Nov 28, 2006

Hello Dhananjay,

First off, your English is vague. This leads to some misunderstanding.
More on that below.

Secondly, it is not clear what BUSINESS PROBLEM you are trying to solve.
Before you jump to "what is wrong with my solution," please help us to
understand what problem your code is trying to solve. There may be a better
way than writing code!

Thirdly, if you have written code, and it is not working, please post it.
That provides a great deal of information for us to help you.

Now, back to your request.

You said:

First I have to import client to a website, and then generate a tool to
convert webpage contents to html format and save it to a sql server
database.

1. I do not know what this phrase means: "import client to a website". I have
no idea what you are trying to accomplish. Can you use different words to
describe what you mean?

2. I do not know what is difficult about this: "convert webpage contents to
html format" since nearly all web pages are already in HTML format. That is
the nature of the web. All browsers begin by reading HTML. Note that if
the HTML in your target web page is constructed on the fly using JavaScript,
then you are going to have a TOUGH time emulating that in C# code.

3. You want to "save it to a sql server database". What is "it" that you
are saving? Each page? Each element on a page? The content of the page?
Why save it to SQL? Do you intend to look up pages using SQL queries? Why
not save it as a web site and use HTTP to get the pages?

I want to help. But until you answer some of these questions, I won't be
terribly helpful.

Note: Are you looking for something like WinHTTrack? This tool is useful
for visiting a web site and creating, on your hard drive, a complete copy of
the site with links intact. It's fairly friendly and easy to use.

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--

Dhananjay · Nov 28, 2006

Nick said:
Hello Dhananjay,

First off, your English is vague. This leads to some misunderstanding.
More on that below.

Secondly, it is not clear what BUSINESS PROBLEM you are trying to solve.
Before you jump to "what is wrong with my solution," please help us to
understand what problem your code is trying to solve. There may be a better
way than writing code!

Thirdly, if you have written code, and it is not working, please post it.
That provides a great deal of information for us to help you.

Now, back to your request.

You said:

1. I do not know what this phrase means: "import client to a website". I have
no idea what you are trying to accomplish. Can you use different words to
describe what you mean?

2. I do not know what is difficult about this: "convert webpage contents to
html format" since nearly all web pages are already in HTML format. That is
the nature of the web. All browsers begin by reading HTML. Note that if
the HTML in your target web page is constructed on the fly using JavaScript,
then you are going to have a TOUGH time emulating that in C# code.

3. You want to "save it to a sql server database". What is "it" that you
are saving? Each page? Each element on a page? The content of the page?
Why save it to SQL? Do you intend to look up pages using SQL queries? Why
not save it as a web site and use HTTP to get the pages?

I want to help. But until you answer some of these questions, I won't be
terribly helpful.

Note: Are you looking for something like WinHTTrack? This tool is useful
for visiting a web site and creating, on your hard drive, a complete copy of
the site with links intact. It's fairly friendly and easy to use.

=============================================================
Hello Nick,
You asked some questions. Put simply, this is what I am trying to
achieve:
I plan to build a cache system. It will import content from
different Dhananjay-Sites, translate the Dhananjay-Code into HTML, and
republish it in a specific format on a file system.

Could you please guide me on how to proceed so that I can achieve this?
Or, if you have developed something like this before, could you send me
the resources so that I can proceed towards the target more easily?
Or do you want more detail? Let me know.

Please reply as soon as possible.
Thanks
Dhananjay

Nick Malik [Microsoft] · Nov 28, 2006

Hello Dhananjay,

Dhananjay said:
Hello Nick,
You asked some questions. Put simply, this is what I am trying to
achieve:
I plan to build a cache system. It will import content from
different Dhananjay-Sites, translate the Dhananjay-Code into HTML, and
republish it in a specific format on a file system.

Could you please guide me on how to proceed so that I can achieve this?
Or, if you have developed something like this before, could you send me
the resources so that I can proceed towards the target more easily?
Or do you want more detail? Let me know.

You are building a cache system. I assume from your statement that the goal
is for a person, using their web browser, to be able to visit a web site
while online, cache the site, and then visit it again when offline. Is this
true? (Are you aware that this is built-in functionality in the IE browser?
Simply add the site to favorites and check the "make available offline"
check box.)

I will assume, given the fact that this is trivial for an individual user,
that you intend for this cache to be visited by more than one user.
Therefore, I assume that the source sites are somehow more 'difficult' to
reach or less reliable than your cache server. In that case, you need to
provide what is called a 'proxy cache' in that the users will hit your site,
looking for the web pages that they want, and your app will get the data
from the remote system, update the local cache, and serve the pages.
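
Purely as an illustration of that "check the local cache, otherwise fetch and store" flow, here is a small sketch that keeps pages as files on disk. Everything in it is an assumption made for the example: the class name FilePageCache, the idea of naming files by a hash of the URL, the maximum-age rule, and the fetch delegate that the caller supplies to do the actual download. It is not a proxy server, only the caching step.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical file-based page cache; names are assumptions.
public class FilePageCache
{
    private readonly string cacheDir;
    private readonly TimeSpan maxAge;
    private readonly Func<string, string> fetch; // downloads a URL's HTML

    public FilePageCache(string cacheDir, TimeSpan maxAge,
                         Func<string, string> fetch)
    {
        this.cacheDir = cacheDir;
        this.maxAge = maxAge;
        this.fetch = fetch;
        Directory.CreateDirectory(cacheDir);
    }

    // URLs contain characters that are not valid in file names, so the
    // file name is a hash of the URL instead of the URL itself.
    private string PathFor(string url)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(url));
            string name = BitConverter.ToString(hash).Replace("-", "");
            return Path.Combine(cacheDir, name + ".html");
        }
    }

    // Serves the cached copy if it exists and is fresh enough;
    // otherwise fetches the page, stores it, and returns it.
    public string Get(string url)
    {
        string path = PathFor(url);
        if (File.Exists(path)
            && DateTime.UtcNow - File.GetLastWriteTimeUtc(path) < maxAge)
        {
            return File.ReadAllText(path);
        }
        string html = fetch(url);
        File.WriteAllText(path, html);
        return html;
    }
}
```

The fetch delegate is passed in so that the download mechanism (WebClient, HttpWebRequest, or anything else) stays separate from the caching rule.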

Of course, there is no need to write code for any of this. Simply install
ISA server. http://www.microsoft.com/isaserver/default.mspx

On the off chance that you posted on a developer forum because you'd rather
develop software than install existing stuff (;-), then perhaps the code on
this link would be helpful. It is not a proxy server. It is, instead, a
web site spider. That actually sounds more like what you are saying you
want. This link provides complete C# source code for downloading web sites
to a local hard drive: See open source code at
http://www.codeproject.com/useritems/ZetaWebSpider.asp
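
To give a rough idea of what a spider does (this is not the code from the link above, just a sketch under my own assumptions), the core is a breadth-first loop: download a page, collect its links, and queue the ones on the same host that have not been seen yet. The names SiteSpider and Crawl are hypothetical, the link matching is a crude regular expression, and the fetch delegate is supplied by the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical spider sketch; names are assumptions.
public static class SiteSpider
{
    // Crawls breadth-first from startUrl, staying on the same host and
    // stopping once maxPages pages have been downloaded.
    // Returns a map from absolute URL to the page's HTML.
    public static Dictionary<string, string> Crawl(
        Uri startUrl, int maxPages, Func<Uri, string> fetch)
    {
        Dictionary<string, string> pages = new Dictionary<string, string>();
        HashSet<string> seen = new HashSet<string>();
        Queue<Uri> queue = new Queue<Uri>();
        queue.Enqueue(startUrl);
        seen.Add(startUrl.AbsoluteUri);

        while (queue.Count > 0 && pages.Count < maxPages)
        {
            Uri current = queue.Dequeue();
            string html;
            try { html = fetch(current); }
            catch (Exception) { continue; } // skip pages that fail to download
            pages[current.AbsoluteUri] = html;

            MatchCollection matches = Regex.Matches(
                html, "href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);
            foreach (Match m in matches)
            {
                Uri next;
                // Resolve relative links, keep http/https links on the
                // starting host, and queue each URL only once.
                if (Uri.TryCreate(current, m.Groups[1].Value, out next)
                    && (next.Scheme == Uri.UriSchemeHttp
                        || next.Scheme == Uri.UriSchemeHttps)
                    && next.Host == startUrl.Host
                    && seen.Add(next.AbsoluteUri))
                {
                    queue.Enqueue(next);
                }
            }
        }
        return pages;
    }
}
```

The returned pages could then be written to disk or to a database. A real spider would also need to respect robots.txt, throttle its requests, and rewrite links so that the saved copy works offline.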

For a more full-featured system that caches web sites, but one that is not
written in C# (to the best of my knowledge) but is still free, check out
HTTrack. The Windows version is WinHTTrack (www.httrack.com).

I hope this helps,
--
--- Nick Malik [Microsoft]
