|
2007-02-07, 09:07 PM | #1 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
robots.txt file
Can someone explain the value of robots.txt? Is it important?
I have created a robots.txt file and uploaded it to the root of my domain, but when I check it using one of the many tools available I get an error message: "robots.txt file does not appear to be valid", and the text shown is from my 404 page. What am I doing wrong? Should I be concerned? |
2007-02-07, 09:38 PM | #2 |
WHO IS FONZY!?! Don't they teach you anything at school?
Join Date: Dec 2006
Location: Poland
Posts: 41
|
Last edited by Mutant; 2007-02-07 at 09:41 PM. |
2007-02-08, 04:24 AM | #3 |
I want to set the record straight - I thought the cop was a prostitute
|
robots.txt tells search engines what NOT to index on your site. It's useful to "hide" your site administration consoles (for example) from SEs.
What tools do you use to search for the file? And why? |
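As a rough illustration of the point above, here is a minimal Python sketch using the standard library's urllib.robotparser. The /admin/ path and the example.com URLs are made up for the example, and real search engines may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks all crawlers to stay out of /admin/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The admin URL is blocked by the Disallow rule; the home page is not.
    print(is_allowed(ROBOTS_TXT, "http://example.com/admin/login.php"))
    print(is_allowed(ROBOTS_TXT, "http://example.com/index.html"))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers. It does not password-protect anything, and the file itself is publicly readable.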
2007-02-08, 02:51 PM | #4 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
I used the Google tool and another website that I didn't bookmark.
Both showed this error message: "Success (200) robots.txt file does not appear to be valid". Where the tool should show the text of my robots.txt file, it shows the text of my 404 page, not my blank robots.txt page. |
2007-02-08, 03:12 PM | #5 |
I want to set the record straight - I thought the cop was a prostitute
|
You should double-check that your robots.txt file is properly uploaded to the domain root (which you have probably done already).
Do you have any redirects in your .htaccess file? According to the documentation, a blank robots.txt file is equivalent to allowing all bots to spider everything on your site. |
2007-02-09, 09:57 AM | #6 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
It is in my root directory. Yes I have a re-direct in my .htaccess file to a 404 page. Do I need to add something so the bots can read my blank robots.txt file?
|
2007-02-09, 01:03 PM | #7 |
I want to set the record straight - I thought the cop was a prostitute
|
If you have something like this in .htaccess
ErrorDocument 404 http://www.yourdomain.com/404page.html
then you don't need to change it. When you upload a blank robots.txt to your domain root, you should be able to open http://yourdomain.com/robots.txt in a browser and get a blank page. If that's not the case, you are probably redirecting somewhere. Maybe it's best to post the contents of your .htaccess, just to be sure nothing is hiding in there. |
2007-02-14, 10:01 PM | #8 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
This is all I have in my .htaccess file:
ErrorDocument 404 http://www.stars4porn.com/missing.html
AddType application/x-httpd-php .html
When I type in http://www.stars4porn.com/robots.txt I get my (almost) empty robots.txt file:
###############################
#
#
User-agent: *
#
# list folders robots are not allowed to index
#
Disallow:
Disallow:
Disallow:
Disallow:
#
# list specific files robots are not allowed to index
#
Disallow:
Disallow:
#
#
###############################
Everything seems to be working fine; my problem is only with Google. When I use their robots.txt tool I get: "robots.txt file does not appear to be valid". Where it should show the text of my robots.txt file, it shows the text of my 404 page, not my (almost) blank robots.txt page. |
2007-02-14, 10:12 PM | #9 |
a.k.a. Sparky
Join Date: Sep 2004
Location: West Palm Beach, FL, USA
Posts: 2,396
|
http://validator.czweb.org/robots-txt.php
while that finds the robots.txt file valid, I'm not quite sure if google might have a problem with the multiple blank disallows. try the following: Code:
User-agent: *
Disallow:
|
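For anyone who wants to compare the variants discussed above, here is a small Python sketch using the standard library's urllib.robotparser. It only shows how this one parser reads a single empty Disallow, several repeated empty Disallow lines, and a completely blank file. It says nothing about how Google's own validator behaves, and the example.com URL is made up.

```python
from urllib.robotparser import RobotFileParser

# The short file suggested in the post above.
MINIMAL = "User-agent: *\nDisallow:\n"

# One record with several empty Disallow lines, similar in shape to the
# file posted earlier in the thread (comment lines left out for brevity).
REPEATED = "User-agent: *\nDisallow:\nDisallow:\nDisallow:\n"

# A completely blank file.
BLANK = ""


def can_fetch(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://example.com/any/page.html"  # made-up URL for illustration
    for name, text in [("minimal", MINIMAL), ("repeated", REPEATED), ("blank", BLANK)]:
        print(name, can_fetch(text, url))
```

With this parser all three variants allow every URL, so the shorter file loses nothing and is easier for a stricter validator to accept.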
2007-02-15, 05:39 PM | #10 | |
Well you know boys, a nuclear reactor is a lot like women. You just have to read the manual and press the right button
Join Date: Nov 2003
Posts: 157
|
I would also change this line
Quote:
ErrorDocument 404 http://www.stars4porn.com/missing.html
to
ErrorDocument 404 /missing.html
as you're currently sending Google and all other search engines a '302 Found' message, not '404 Not Found'.
|
|
2007-02-15, 06:58 PM | #11 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
Not following JK. This is my 404 page and it is where I want to send my 404 traffic. Where does:
ErrorDocument 404 /missing.html send them? |
2007-02-15, 07:27 PM | #12 |
Well you know boys, a nuclear reactor is a lot like women. You just have to read the manual and press the right button
Join Date: Nov 2003
Posts: 157
|
It will send visitors to the same page.
Because the directive contains the full http://www... URL, the server redirects to the page instead of returning the status code "404 Not Found", which can cause problems.
|
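To make the distinction concrete, here is a small Python sketch that classifies the target of an ErrorDocument line the way the Apache documentation describes it: a full URL makes the server send a redirect, so the client never sees the original 404 status, while a path starting with a slash is served locally with the error status intact. The function name error_document_target is a hypothetical helper written for this example, not anything provided by Apache.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def error_document_target(directive: str) -> str:
    """Classify the target of an Apache ErrorDocument line.

    Hypothetical helper for illustration; it is not part of Apache.
    Returns "external-redirect", "local-path", or "message".
    """
    parts = directive.split(None, 2)
    if len(parts) != 3 or parts[0].lower() != "errordocument":
        raise ValueError(f"not an ErrorDocument directive: {directive!r}")
    target = parts[2].strip()
    if _SCHEME.match(target):
        # Full URL: the server answers with a redirect, not the error code.
        return "external-redirect"
    if target.startswith("/"):
        # Local web path: served in place, original status code preserved.
        return "local-path"
    # Anything else is treated as a literal message to display.
    return "message"


if __name__ == "__main__":
    print(error_document_target("ErrorDocument 404 http://www.stars4porn.com/missing.html"))
    print(error_document_target("ErrorDocument 404 /missing.html"))
```

This only inspects the directive text. To confirm what a server actually sends, request a non-existent URL with a tool that shows response headers and look at the status line.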
2007-02-15, 09:31 PM | #13 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
Just tried it, JK, but when I mistyped a URL on one of my sites I didn't get my 404 page, just a Google "page not found" page.
|
2007-02-15, 11:13 PM | #14 |
Well you know boys, a nuclear reactor is a lot like women. You just have to read the manual and press the right button
Join Date: Nov 2003
Posts: 157
|
Sorry, I should have mentioned it will only work on the domain that hosts the missing.html file. If you have multiple domains sending 404 traffic to that page then you will either need to mirror the page on each domain, or keep it as it was.
Is your robots.txt now validating with Google?
|
2007-02-16, 09:59 AM | #15 |
They have the Internet on computers, now?
Join Date: Apr 2005
Posts: 148
|
Now working on main domain and 2 sub-domains. Still not on 3 subs. Now get google 404 error - missing page
|