Robots.txt

Fred

Could any kind person show me EXACTLY what one should put on the Robots.txt
(notepad) to ALLOW all search engine spiders to crawl and index the whole
website. Fred
 
Tom Pepper Willett

http://www.searchengineworld.com/robots/robots_tutorial.htm
--
===
Tom "Pepper" Willett
Microsoft MVP - FrontPage
---
About FrontPage 2003:
http://office.microsoft.com/home/office.aspx?assetid=FX01085802
FrontPage 2003 Product Information:
http://www.microsoft.com/office/frontpage/prodinfo/default.mspx
Understanding FrontPage:
http://msdn.microsoft.com/office/understanding/frontpage/
FrontPage 2002 Server Extensions Support Center:
http://support.microsoft.com/default.aspx?scid=fh;en-us;fp10se
===
 
Thomas A. Rowe

The purpose of a robots.txt file is to exclude robots from your site or from indexing certain
content. If you don't have one then your site is open to all robots.

--
==============================================
Thomas A. Rowe (Microsoft MVP - FrontPage)
WebMaster Resources(tm)

FrontPage Resources, WebCircle, MS KB Quick Links, etc.
==============================================
If you feel your current issue is a result of installing
a Service Pack or security update, please contact
Microsoft Product Support Services:
http://support.microsoft.com
If the problem can be shown to have been caused by a
security update, then there is usually no charge for the call.
==============================================
 
Floyd

Exactly. If you want all search bots to access and index your site freely,
don't create a robots.txt file at all.
 
Jon Spivey

Hi,

You probably wouldn't want to let every spider go anywhere on your site. As a
first step, it's a good idea to block your /images directory just to save the
bandwidth. Also, there are a lot of spiders out there that will crawl your
site without any benefit to you. It's a good idea to block them, both to save
bandwidth and to keep your site fast for paying customers.
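
As a rough illustration of the idea above, here is a small Python sketch that checks such a robots.txt with the standard library's urllib.robotparser. The crawler name "BadBot", the www.example.com host, and the is_allowed helper are made up for the example; substitute the user-agent strings you actually see in your logs.

```python
from urllib.robotparser import RobotFileParser

# "BadBot" is a hypothetical crawler name used only for illustration.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /images/
"""


def is_allowed(agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a well-behaved crawler named `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for agent, path in [
        ("BadBot", "/index.htm"),
        ("Googlebot", "/images/logo.gif"),
        ("Googlebot", "/index.htm"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

This only models crawlers that choose to obey the file; it does nothing to stop one that ignores it.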
 
Thomas A. Rowe

However, "bad" spiders generally do not honor the robots.txt file even when they read it.
 
David Baxter

User-agent: *
Disallow:

These two lines "disallow nothing", i.e., "allow everything", for all
search engine spiders.

If you'd rather prevent spiders from indexing some of your directories
or files:

User-agent: *
Disallow: /error.htm
Disallow: /cgi-bin/

The first Disallow line above excludes an individual file from being
indexed. Note that the path must begin with a slash.
The second Disallow line excludes a directory under the root (cgi-bin)
and everything inside it.
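
If you want to check rules like these before uploading the file, a minimal Python sketch using the standard library's urllib.robotparser might look like the following. It assumes the file path is written with a leading slash (/error.htm); the build_parser helper, the "ExampleBot" agent name, and the www.example.com host are placeholders invented for the example.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value blocks nothing, so every URL is allowed.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Block one file and one directory for every crawler.
BLOCK_SOME = """\
User-agent: *
Disallow: /error.htm
Disallow: /cgi-bin/
"""

SITE = "http://www.example.com"  # placeholder host


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in a string rather than fetched from a site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    open_site = build_parser(ALLOW_ALL)
    partial = build_parser(BLOCK_SOME)
    for path in ("/index.htm", "/error.htm", "/cgi-bin/form.pl"):
        print(
            path,
            open_site.can_fetch("ExampleBot", SITE + path),
            partial.can_fetch("ExampleBot", SITE + path),
        )
```

Rules are matched as path prefixes, which is why a rule without the leading slash would not match any URL path in this parser.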
 
