Beacon Hill, Inc. Technology Solutions

Tomcat and Robots.txt

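If you want to keep crawlers and other robots away from certain parts of an application running under Tomcat, you can use a robots.txt file, much as you would with Apache.

The catch is that you need to put the file inside the webapps/ROOT directory. Robots only request /robots.txt from the root of the site, and Tomcat serves the root context from the ROOT web application.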

For example, I run a web application called downloader that lets people download files through a service, so that I can be notified, keep track of download counts, and keep the files in Amazon's S3 service.

But I noticed that when I ran a link checker such as http://validator.w3.org/checklink, it accessed the download URLs and skewed my numbers.

Since the application is named downloader and the URL within it is /get.htm, I need to disallow /downloader/get.htm.

The following is my robots.txt for this example:

apache-tomcat-6.0.18/webapps/ROOT$ cat robots.txt

User-Agent: *
Allow: /
Disallow: /downloader/get.htm
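To sanity-check rules like these without waiting for a crawler, a small script can feed them to Python's standard urllib.robotparser module. This is only a sketch: the is_allowed helper and the /downloader/index.htm path are hypothetical names used for illustration. Python's parser applies the first rule that matches, so the sketch lists the Disallow line before Allow: /. Robots that pick the most specific (longest) matching rule should treat either order the same way.

```python
from urllib.robotparser import RobotFileParser

# Same rules as the robots.txt above, but with the specific Disallow first,
# because urllib.robotparser returns the first rule that matches a path.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /downloader/get.htm
Allow: /
"""


def is_allowed(path, agent="*"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # The download URL should be blocked; other pages should stay reachable.
    for path in ("/downloader/get.htm", "/downloader/index.htm"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Keep in mind that robots.txt is advisory. It keeps well-behaved robots such as the W3C link checker off the download URL, but it does not stop a client that ignores the file.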