Spider trap detection
Spider traps, also known as crawler traps, are a major problem when crawling the internet. Using a well-trained neural network, we developed an algorithm capable of detecting recursive and repetitive spider traps that cannot be caught by regular expressions.
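To illustrate the kind of URLs involved: a recursive trap might generate paths like `/calendar/next/next/next/…` indefinitely. A minimal rule-based sketch of trap detection might look like the following (the function name and thresholds are illustrative assumptions, not our neural-network detector, which handles patterns rules like these miss):

```python
from urllib.parse import urlparse

def looks_like_trap(url, max_depth=12, max_repeats=3):
    """Heuristic check for recursive or repetitive URL paths.

    Illustrative sketch only: flags URLs whose path is unusually deep
    or contains the same segment repeated many times.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) > max_depth:
        return True
    # Flag any single path segment that repeats suspiciously often.
    for seg in set(segments):
        if segments.count(seg) > max_repeats:
            return True
    return False
```

A repetitive path such as `/a/b/a/b/a/b/a/b` would be flagged, while an ordinary product page would pass. Real traps, however, often vary their URLs enough to defeat fixed thresholds like these, which is why we moved to a learned detector.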
Spider trap detector
We use this detector in our Sitesearch service to avoid repeatedly crawling useless pages. If you operate a web crawler and this problem plagues you, don't hesitate to contact us or try it out yourself.