How to index HTML files locally even with ROBOTS noindex?

  • Thread starter: Guest

Guest
I have a local mirror copy of the Web sites I manage. Some of the HTML pages
I don't want to be indexed by Web spiders / robots, so I put the ROBOTS
meta-tag with "noindex" in them.
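
For reference, the tag I'm talking about is the standard robots meta-tag in each page's <head>:

    <meta name="robots" content="noindex">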

However, I would like those files to be indexed locally, so that I can find
things in them with Windows' local indexed search. But the Windows HTML
filter intentionally does NOT index files with ROBOTS noindex, so those
files never show up in my local searches.

Is there a way to tell the HTML filter to go ahead and index HTML files even
if they have the ROBOTS noindex meta-tag? I want my local and remote copies
to be identical, so I don't want to have ROBOTS index locally and ROBOTS
noindex remotely.

Has anybody else run into this problem? Does anyone have a solution?

Thanks!

YMA
 
YMA said:
Is there a way to tell the HTML filter to go ahead and index HTML files
even if they have the ROBOTS noindex meta-tag? ...


I just put a robots.txt file in the root folder of the website instead. I'm
not familiar with the meta-tag, but the text file may be more flexible, as you
can define which folders the spiders may and may not index.
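
For example, a minimal robots.txt that keeps spiders out of a couple of
folders would look like this (the folder names below are just placeholders):

    # Applies to all well-behaved spiders
    User-agent: *
    Disallow: /private/
    Disallow: /drafts/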

Loads more info here:
http://www.google.co.uk/search?sourceid=navclient&ie=UTF-8&rlz=1T4SUNA_enGB227GB227&q=robots.txt

ss.
 
Synapse Syndrome said:
I just put a robots.txt file in the root folder of the website instead. ...


Also, according to this page, not all spiders obey the meta-tag:
http://www.robotstxt.org/wc/exclusion.html#meta

ss.
 
Thanks for your answer, but I don't have access to the root folder of my
websites (with just one exception), so I really need a way to tweak the
local HTML filter on my machine...

BTW, I am aware of the limitations of the ROBOTS noindex meta-tag, but I can
live with them.

YMA
 