Blog stats – Jeffrey Pomerantz

I have had neither the time nor the inclination recently to post anything here. But for reasons I don’t fully understand, I am unwilling to just call it quits on this blog. Maybe I’ll regain interest in blogging eventually.

In the meantime, I’ve done some simple analysis of my categorization of posts. This figure includes all posts except this one.

[Figure: Blog stats graph]

Bates (1998) states that “most information-related phenomena have been found to fall into [this] class of statistical distributions” (p. 1193), by which she means the Zipf / Bradford / Pareto / whatever-you-want-to-call-it curve. Nice to know I’m in good company. In fact, though, the curve of my blog post categories is not really 80/20; it’s more like 60/40. There are 23 categories, and 20% of 23 is 4.6, which rounds up to 5. The top 5 categories account for 58% of all posts. You have to get to the 10th category before you reach 80% of the posts (81%, to be exact). Another thing to note is that General is the 2nd-largest category. Any classification scheme where the Miscellaneous category is that large is a bad classification scheme.
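For anyone who wants to run the same 80/20 check on their own categories, here is a minimal Python sketch of the arithmetic above. The function names and the counts in example_counts are made up for illustration; they are not my actual per-category numbers, just 23 invented values with a similar long-tailed shape.

```python
import math


def top_share(counts, fraction=0.2):
    """Return (k, share): k is `fraction` of the categories, rounded up,
    and share is the proportion of all posts in the k largest categories."""
    ranked = sorted(counts, reverse=True)
    k = math.ceil(fraction * len(ranked))
    return k, sum(ranked[:k]) / sum(ranked)


def categories_needed(counts, target=0.8):
    """Return the smallest number of top categories whose combined posts
    reach at least `target` of the total."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    running = 0
    for rank, count in enumerate(ranked, start=1):
        running += count
        if running >= target * total:
            return rank
    return len(ranked)


# Hypothetical post counts for 23 categories (NOT the real blog data).
example_counts = [40, 30, 20, 15, 12, 10, 9, 8, 7, 6, 5, 5,
                  4, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1]

k, share = top_share(example_counts)
print(f"Top {k} categories hold {share:.0%} of posts")
print(f"Categories needed to reach 80%: {categories_needed(example_counts)}")
```

Rounding 4.6 up to 5 with math.ceil mirrors the choice made above; rounding down to 4 categories would make the top slice look even less Pareto-like.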

Bates, M. J. (1998). Indexing and Access for Digital Libraries and the Internet: Human, Database, and Domain Factors. Journal of the American Society for Information Science, 49(13), 1185-1205.