Web stats and robots

2008-01-08

I was analyzing the web traffic of some of our most popular member sites, trying to understand how to better handle high traffic, when I found an interesting statistic. Here are the top four IP addresses in the web logs of one member’s web site over a period of five days.

	IP ADDRESS        NUMBER OF LINES IN ACCESS LOG
	                  61729
	                  47576
	                  44039
	                  6560

What we’re seeing are three IP addresses dominating the site, each with roughly 7 to 9 times as many hits as the next most active IP address. Who are these people?!? Well, the first one identifies itself as Google and the next two as Yahoo. In other words, they are robots feeding search engines. The fourth looks like a regular web browser (running Opera!!).
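
For anyone who wants to run the same kind of tally on their own logs, here is a minimal sketch in Python. It assumes the access log uses the common or combined log format, where the client IP address is the first whitespace-separated field on each line. The function name top_ips and the sample lines are made up for illustration (the addresses come from ranges reserved for documentation); they are not taken from the logs discussed above.

```python
from collections import Counter


def top_ips(lines, n=4):
    """Return the n most frequent client IPs as (ip, line_count) pairs.

    Assumes the client IP is the first whitespace-separated field of
    each log line, as in the common/combined log format.
    """
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Hypothetical sample data, not real traffic.
sample = [
    '192.0.2.1 - - [08/Jan/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [08/Jan/2008:10:00:01 +0000] "GET /a HTTP/1.1" 200 128',
    '192.0.2.1 - - [08/Jan/2008:10:00:02 +0000] "GET /b HTTP/1.1" 200 256',
    '203.0.113.9 - - [08/Jan/2008:10:00:03 +0000] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [08/Jan/2008:10:00:04 +0000] "GET /c HTTP/1.1" 404 64',
    '192.0.2.1 - - [08/Jan/2008:10:00:05 +0000] "GET /d HTTP/1.1" 200 300',
]

for ip, hits in top_ips(sample):
    print(f"{ip:<18}{hits}")
```

To use it on a real file, pass an open file object instead of the sample list, for example top_ips(open("access.log")), where access.log is a placeholder path. Since a file is read line by line, this should cope with large logs without loading them into memory all at once.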