
Managing Multiple Sitemaps

Tags: seo

Introduction #

It is easy to use more than one sitemap on shared/virtual hosting with Google’s Webmaster Tools or Yahoo’s Site Explorer, because you can register a differently named sitemap for each site or let the tool explore the site directly. With other search engines this becomes complicated. Fortunately, we can declare the location of a sitemap in the site’s robots.txt file with this line:

Sitemap: http://www.yoursite1.com/sitemap.xml

But how can we have a different robots.txt file for each instance on shared hosting?

Multiple robots.txt #

A sitemap index can't solve this issue because it can't map sitemaps to different domains/instances, so we had to look for an alternative method that gives each domain its own robots.txt file, in which we specify the sitemap source and the URLs allowed/disallowed to search engines. This method has one 'little' handicap: you need access to Apache's httpd.conf.

Based on the rewrite methods used in SEO, we tried to resolve requests for

http://<domain>/robots.txt

by changing the destination .txt file. For example, when a search engine requests robots.txt, Apache serves the file robots_site1.txt instead.

Follow these steps:

  1. Create the file robots_site1.txt containing the Sitemap line. This .txt file must be in the root of the WAR. If you are using Tomcat:
    <install_directory>/webapps/ROOT/robots_site1.txt
  2. Modify the httpd.conf file

    We have something like this:



    Pay attention to the 3 lines with the 'rewrite' command. Make sure that your httpd.conf loads the rewrite module. Look for this line near the beginning of the file:

    LoadModule rewrite_module modules/mod_rewrite.so

  3. Restart Apache

You need to create one robots_siteN.txt file for each site you host, and you also need to edit httpd.conf for each domain it manages.
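
The httpd.conf fragment referred to in step 2 is not reproduced on this page. As a rough sketch only, and not the original configuration, three rewrite lines of the kind described could look like the following for one site. The host name www.yoursite1.com and the file robots_site1.txt come from the examples above; the VirtualHost layout and the PT flag are assumptions, and the directives that forward requests to Tomcat are left out because the article does not say which connector is used.

```apache
# Hypothetical virtual host for the first site; repeat one block per domain.
<VirtualHost *:80>
    ServerName www.yoursite1.com

    # Enable the rewrite engine for this virtual host.
    RewriteEngine On
    # Apply the rule only when the request is addressed to this domain.
    RewriteCond %{HTTP_HOST} ^www\.yoursite1\.com$ [NC]
    # Internally map /robots.txt to this site's own file. PT hands the
    # rewritten path on to whatever forwards requests to Tomcat.
    RewriteRule ^/robots\.txt$ /robots_site1.txt [PT,L]

    # Existing connector directives (assumed, not shown in the article) go here.
</VirtualHost>
```

With a rule like this, a request for http://www.yoursite1.com/robots.txt should return the contents of robots_site1.txt while the URL seen by the search engine stays the same, since this is an internal rewrite rather than an HTTP redirect.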

Comments
Julio Hurtado, May 15, 2012 1:00 PM:

I have Liferay + JBoss 6.0.6, but I have the robots.txt at the following level and it does not work: D:\LiferayIBK\jboss-5.1.0\server\default\deploy\ROOT.war .. I get the message

Status: Not Found

The requested resource was not found.

http://192.168.1.164/web/guest/robots.txt

thanks
Posted on 5/15/12 1:00 PM.