Discoveries in web proxy logs

Below is a list of the top 100 IP addresses that have made requests to my website since the beginning of November. It’s a fascinating list, no?

The List

Turns out a list of 100 anything takes a bit to scroll through…

Observations

First off, before you accuse me of surveillance, consider: all this tells me is that a machine registered at the following IP address sent an HTTP request to my website. It does not tell me the identity or location of the requestor.

I love how international this list is. Some of these are probably users (or bots) using software to further mask their identity behind a wildly inaccurate IP address, and the vast majority are probably aggregators of various kinds blindly polling the Internet. Even so, it makes me feel more tethered to the world to see so many places represented. French bots are registering my pages, woohoo! And there are a few (can you spot them?) which might, just maybe, be requests from REAL PEOPLE!

I’ve since blocked the last requestor’s IP address, 109.102.111.58, after I realized that its requests were suspicious. There are dozens of automated requests from aggregators, but these self-identify as such. 109.102.111.58 did not self-identify - even worse, it hid behind a randomized identity. Not cool.

That may be a long list, but I’m an obscure guy. Here are my top three posts by count:

  1. (79) /posts/rust-server-docker/

  2. (63) /posts/choose-leaders-over-companies/

  3. (39) /posts/todderish/

Good thing I don’t do this for the fame…

Filters

Most of you won’t care about this but, for the curious, I did filter out a few requestors.

First, my own IP address. I’m my own biggest fan!

Second, self-identified bots. A lot of them.

and my favorite…

Serendeputy-bot - shout-out to Jason Butler for creating a landing page that explains why his bot is crawling my site. You can send your bot over anytime, Mr. Butler.

Third, a very special bot, Bytespider. A younger generation might guess who this is from - the same umbrella company that gave us TikTok. Its behavior is a bit stranger than most. For some reason, it also queries my pages with an appended “depth:[integer]”, sometimes even replacing part of the path. For example, it will query this post, /posts/my-biggest-fans, and also /posts/my-biggest-depth:2. What’s that about?
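For the curious, the three filters above could be sketched roughly like this in Python. This is a minimal sketch, not the script that produced the list: it assumes the log has already been parsed into (ip, path, user_agent) tuples, and the names OWN_IPS, BOT_MARKERS, is_filtered and top_requestors are hypothetical, as is the placeholder address. It also assumes that a self-identified bot can be spotted by a marker word in its user agent, and it treats the “depth:[integer]” pattern as one more reason to drop a request.

```python
import re
from collections import Counter

# Hypothetical placeholder: replace with your own public IP address.
OWN_IPS = {"203.0.113.7"}

# Assumption: self-identified bots carry one of these words in their user agent.
BOT_MARKERS = ("bot", "spider", "crawler")

# Matches the odd "depth:[integer]" fragment, e.g. /posts/my-biggest-depth:2
DEPTH_PATTERN = re.compile(r"depth:\d+")


def is_filtered(ip, path, user_agent):
    """Return True if a request should be left out of the count."""
    if ip in OWN_IPS:
        return True
    agent = user_agent.lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return True
    return bool(DEPTH_PATTERN.search(path))


def top_requestors(records, n=100):
    """Count the remaining requests per IP address, most frequent first."""
    counts = Counter(
        ip for ip, path, user_agent in records
        if not is_filtered(ip, path, user_agent)
    )
    return counts.most_common(n)
```

Calling top_requestors(records) on the parsed log would return a list of (ip, count) pairs, and the same Counter approach, keyed on the path instead of the IP, would give a per-post count.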

That’s more than enough list-making for one day. Congratulations to any of you who might recognize your IP address on this list - you’re enlisted in my Hall of Fame!