For the past several days (today definitely included), I’ve had craploads of “visits” from a slew of connections with addresses all looking very similar. A few examples:
bl1sch4090511.phx.gbl
bl1sch4084504.phx.gbl
bl1sch4084204.phx.gbl
All of them end with “.phx.gbl”, and when I looked up the IP addresses via WHOIS, they all seemed to belong to Microsoft. All these visits come via searches for inane terms which I know don’t return my site anywhere near the top. Really, do you even think my page is on the first 50 pages of search results for the word “would”?!
I know these hits are from a search bot, but why is my site registering hits from them at all? Do the bots load all the pages in the search results to confirm they’re reachable? It’s starting to piss me off because it throws off my stats quite a bit, and I wish there was a way to ignore them.
5 Comments
You can’t exclude all “visits” from hosts ending in .phx.gbl? That seems like something any analytics software would allow in its filters.
I can have it ignore referrals, but I’m not sure if that means bot visits. Maybe it does.
I just tried modifying the filter I had in there from “phx.gbl” to the following:
*.phx.gbl
But I don’t know if it looks at wildcards like that.
Make sure that the filters don’t take regex. If they do, your unescaped dots will match any character except a newline, not just a literal dot.
If they DO take Regex, try this:
(.*)\.phx\.gbl
The parens aren’t actually necessary. That’s what I get for dreaming about mod_rewrite.
.*\.phx\.gbl
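To see why the escapes matter, here is a minimal Python sketch. It assumes the filter uses ordinary regex semantics, which nobody in this thread has confirmed for the stats software in question. The host `myphxxgbl.example.com` is made up for the demonstration, and the trailing `$` anchor is my addition so the pattern only matches at the end of the hostname.

```python
import re

# Unescaped: every "." matches any character except a newline.
unescaped = re.compile(r".*.phx.gbl")

# Escaped: "\." matches only a literal dot. The "$" anchor (an addition,
# not part of the pattern posted above) pins the match to the end.
escaped = re.compile(r".*\.phx\.gbl$")


def matches(pattern, host):
    """Return True if the pattern is found anywhere in the hostname."""
    return pattern.search(host) is not None


if __name__ == "__main__":
    hosts = [
        "bl1sch4090511.phx.gbl",   # from the post
        "images.google.com",       # default filter mentioned in the thread
        "myphxxgbl.example.com",   # hypothetical host, for illustration only
    ]
    for host in hosts:
        print(host, matches(unescaped, host), matches(escaped, host))
```

The unescaped pattern also catches the made-up host, because its dots are happy to match the "y" and the "x". The escaped one only catches real .phx.gbl hostnames.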
One of the default filters was typed just like this:
images.google.com
So I don’t think I need escapes or anything... but it would be nice if a wildcard worked. I can tell that simply using the asterisk does not.
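For anyone post-processing their own logs instead, here is a rough Python sketch of a filter that accepts both styles: a plain domain like the default entry, and a shell-style wildcard. How the stats software actually matches its filters is unknown; `is_filtered` is a hypothetical helper, and treating a plain entry as "this host or any subdomain of it" is my assumption.

```python
from fnmatch import fnmatchcase


def is_filtered(host, filters):
    """Return True if host matches any filter entry.

    Entries containing "*" or "?" are treated as shell-style wildcards.
    Plain entries match the host itself or any subdomain of it
    (an assumption; the real stats software may behave differently).
    """
    host = host.lower()
    for entry in filters:
        entry = entry.lower()
        if "*" in entry or "?" in entry:
            if fnmatchcase(host, entry):
                return True
        elif host == entry or host.endswith("." + entry):
            return True
    return False


if __name__ == "__main__":
    filters = ["images.google.com", "*.phx.gbl"]
    for host in ["bl1sch4084204.phx.gbl", "images.google.com", "example.com"]:
        print(host, is_filtered(host, filters))
```

With this approach either `phx.gbl` or `*.phx.gbl` drops the bot hosts, so it doesn't matter which form ends up in the filter list.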