Web/PHP programming – detecting spiders/robots?
Is there a reliable way to detect non-malicious spiders/robots/crawlers in a PHP script? (Malicious ones can easily impersonate browsers, so they are out of scope for my question.) I’m looking for something like an open-source User-Agent list that is updated periodically. Does any such thing exist? Thanks.
Snow is right.
-Billy
I think this is better handled by a robots.txt file on your server, but see the links below.
References :
http://www.user-agents.org/
http://www.jafsoft.com/searchengines/webbots.html
http://www.robotstxt.org/orig.html
http://ptlis.net/source/php-content-negotiation/
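If you do want the check inside the PHP script itself, here is a minimal sketch of matching the User-Agent header against a list of crawler tokens. The function name looks_like_crawler and the short token list are my own placeholders, not taken from any of the lists linked above; in practice you would load the tokens from whichever list you decide to maintain. It assumes PHP 7.1 or later and, as the question says, only catches robots that identify themselves honestly.

```php
<?php
// Hypothetical helper: true when the User-Agent contains a known crawler token.
// The default tokens are a small illustrative sample, not a maintained list.
function looks_like_crawler(
    string $userAgent,
    array $tokens = ['googlebot', 'bingbot', 'slurp', 'duckduckbot', 'baiduspider', 'yandex']
): bool {
    $ua = strtolower($userAgent);
    foreach ($tokens as $token) {
        // Case-insensitive substring match; skip empty tokens so they never match.
        if ($token !== '' && strpos($ua, strtolower($token)) !== false) {
            return true;
        }
    }
    return false;
}

// Usage inside a request: the header may be missing, so fall back to an empty string.
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
$isCrawler = looks_like_crawler($ua);
```

Substring matching is deliberately loose: crawler User-Agent strings usually embed the bot name inside a longer browser-like string, so an exact comparison would miss them.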
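Hi,
Make a file called robots.txt and put this in it:
User-agent: *
Disallow: /
Put it in the document root of each domain you host.
That asks every well-behaved robot to stay out of the whole site. Robots that ignore robots.txt are not affected.
To find out who is requesting your pages, read your Apache access and error log files. With the combined log format, each access log entry records the client's User-Agent. (PHP can read those files too.)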
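As a rough sketch of the "PHP can do that" part: the code below pulls the User-Agent out of access log lines and counts requests per agent. It assumes the Apache "combined" log format, where the User-Agent is the last quoted field on each line; with the plain "common" format there is no User-Agent and the last quoted field would be the request line instead. The function names and the log path in the comment are placeholders of mine, and a User-Agent containing escaped quotes is not handled.

```php
<?php
// Extract the last quoted field (the User-Agent in the "combined" format)
// from one access log line. Returns null when the line has no quoted field.
function user_agent_from_log_line(string $line): ?string
{
    if (preg_match('/"([^"]*)"\s*$/', $line, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Count requests per User-Agent, most frequent first.
function count_user_agents(iterable $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        $ua = user_agent_from_log_line($line);
        if ($ua !== null) {
            $counts[$ua] = ($counts[$ua] ?? 0) + 1;
        }
    }
    arsort($counts); // sort by count, descending, keeping the User-Agent keys
    return $counts;
}

// Example call (the path is an assumption; it varies by system):
// $counts = count_user_agents(file('/var/log/apache2/access.log', FILE_IGNORE_NEW_LINES));
```

Robots that honor robots.txt will also show up in the log requesting /robots.txt itself, which is another quick way to spot them.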