
What about further development of /robots.txt?


There are no efforts on this site to further develop /robots.txt, and I am not aware of technical standards bodies like the IETF or W3C working in this area.

There are some industry efforts to extend robots exclusion mechanisms. See, for example, the collaborative efforts announced on the Yahoo! Search Blog, Google Webmaster Central Blog, and Microsoft Live Search Webmaster Team Blog, which include wildcard support, Sitemaps, and additional META tags.
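As an illustration of these extensions, a robots.txt file using them might look like the following (an illustrative sketch; the domain is a placeholder, and support for each directive varies by crawler):

```text
# Block all compliant robots from PDF files, using the
# '*' wildcard and '$' end-of-URL anchor extensions.
User-agent: *
Disallow: /*.pdf$

# Sitemap location, as introduced by the Sitemaps protocol.
Sitemap: https://www.example.com/sitemap.xml
```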

It is of course important to realise that older robots may not support these newer mechanisms. For example, if you use "Disallow: /*.pdf$" and a robot does not treat '*' as a wildcard and '$' as an end-of-URL anchor, it will read the rule as a literal path prefix, and your PDF files are not excluded.
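The difference can be sketched in Python (an illustrative model, not the parsing code of any real crawler; the regex translation of '*' and '$' is an assumption about how wildcard-aware robots behave):

```python
import re

RULE = "/*.pdf$"

def legacy_disallowed(path: str) -> bool:
    # An older robot treats the rule as a plain path prefix: a URL path
    # is blocked only if it literally starts with "/*.pdf$", which in
    # practice matches nothing.
    return path.startswith(RULE)

def wildcard_disallowed(path: str) -> bool:
    # A wildcard-aware robot treats '*' as "any sequence of characters"
    # and a trailing '$' as an end-of-URL anchor, roughly this regex:
    pattern = re.escape(RULE[:-1]).replace(r"\*", ".*") + "$"
    return re.match(pattern, path) is not None

print(legacy_disallowed("/files/report.pdf"))    # False: PDF still crawled
print(wildcard_disallowed("/files/report.pdf"))  # True: PDF excluded
```

So the same rule excludes the PDF from one robot and not the other, which is why such patterns should be used with care.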