www.BrettDaniel.com

Stats Page Improvements

Weblogs are sometimes criticized for obsessive navel-gazing. Exhibit A: my improved stats page. Since posting it last September, I have made many additions and incremental changes.

Three New Graphs

I added three new graphs plotting comments, links per entry, and sorted entry lengths.

The comments graph was the easiest to put together. It uses nearly the same MySQL query that displays the count in the comment link below each post. I truncated the graph because I added commenting only last August.

"Links per Entry" required more thought. To find the number of links in an entry, I needed to count the occurrences of the string "<a href=". I could have done this in PHP after retrieving every entry, but I wanted to leave the counting to the SQL statement (I'll explain why under "Top Five Lists"). MySQL has no built-in function for counting substring occurrences, so I used a messy hack instead. I removed every occurrence of "<a href=" from the entry text and subtracted the length of the result from the original length, which gave me the number of characters removed. Dividing that by eight, the length of "<a href=", gave me the number of links in each entry.

SELECT (char_length(entrytext) - char_length(replace(entrytext,'<a href=',''))) / 8 AS occurrences ...
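The same length-difference trick can be checked outside MySQL. Here is a minimal sketch in Python that uses the standard sqlite3 module as a stand-in (SQLite spells the function length() rather than char_length()); the helper names and the sample text are made up for illustration and are not part of the stats page code.

```python
import sqlite3

NEEDLE = "<a href="  # the 8-character string being counted


def count_links_python(text):
    """Count occurrences directly, as a cross-check."""
    return text.count(NEEDLE)


def count_links_sql(conn, text):
    """Count occurrences with the remove-and-measure trick, in SQL."""
    row = conn.execute(
        "SELECT (length(?) - length(replace(?, ?, ''))) / length(?)",
        (text, text, NEEDLE, NEEDLE),
    ).fetchone()
    return row[0]


# In-memory SQLite database: an assumption standing in for MySQL.
conn = sqlite3.connect(":memory:")
sample = 'See <a href="/one">one</a> and <a href="/two">two</a>.'
print(count_links_sql(conn, sample), count_links_python(sample))
```

Both counts agree because removing each match shortens the text by exactly the length of the search string. Note that the match is exact, so a link written as "<A HREF=" would not be counted by either version.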

The "sorted entry lengths" graph is easily the most beautiful graph I have plotted so far. It shows what you might have already guessed from looking at the normal entry length graph: I have only a handful of longer posts and a boatload of short ones. It is a perfect illustration of The Long Tail or the 80-20 Rule.

I thought there might be a normal distribution hidden in there somewhere, so I tried various ways of grouping entries by length and counting the number of entries in each group. The results were pretty messy. Because my entries have grown longer over time, there is no stable average for a nice bell curve to form around.
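For anyone who wants to try the same grouping, here is a rough sketch in Python of counting entries per length bucket. The bucket width and the list of lengths are invented for illustration; they are not my real numbers.

```python
from collections import Counter


def bucket_counts(lengths, width):
    """Group lengths into fixed-width buckets and count each bucket.

    Returns a sorted list of (bucket_start, count) pairs.
    """
    counts = Counter((n // width) * width for n in lengths)
    return sorted(counts.items())


# Hypothetical entry lengths in characters: mostly short, a few long.
lengths = [120, 180, 240, 310, 95, 2300, 450, 5100, 130, 220]

for start, count in bucket_counts(lengths, 500):
    # One '#' per entry gives a crude text histogram.
    print(f"{start:>5}-{start + 499:<5} {'#' * count}")
```

With data shaped like this, nearly everything lands in the first bucket and the rest trails off, which is the long tail again rather than a bell curve.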

Top Five Lists

The top five lists provide links to the five most important peaks or troughs that you may have wondered about when looking at the graphs. I just took the SQL statements used to create the graphs, stuck on an ORDER BY, and limited them to five results each. Easy.

This was where the SQL substring counter became useful. When finding the five entries containing the most links, it was far easier to send a single SQL query than to use PHP to loop through every entry and extract the correct ones.
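As a sketch of the idea, the query below counts links per entry, sorts by that count, and keeps the top five, all in one statement. It runs against an in-memory SQLite database from Python as a stand-in for MySQL; the entries table, its title column, and the sample rows are assumptions made for this example (only entrytext matches a real column name).

```python
import sqlite3

# In-memory SQLite database with a hypothetical table layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (title TEXT, entrytext TEXT)")

# Made-up data: "post n" contains exactly n links.
link = '<a href="#">x</a> '
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [(f"post {n}", link * n) for n in range(8)],
)

# The graph query plus ORDER BY and LIMIT gives the top five list.
TOP_FIVE = """
    SELECT title,
           (length(entrytext) - length(replace(entrytext, '<a href=', ''))) / 8
               AS occurrences
    FROM entries
    ORDER BY occurrences DESC
    LIMIT 5
"""
top_five = conn.execute(TOP_FIVE).fetchall()
for title, occurrences in top_five:
    print(title, occurrences)
```

The database does the counting, sorting, and trimming, so the calling code only has to print five rows instead of fetching and scanning every entry.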

The makeGraph() Source

I posted the makeGraph() source code for anyone who may want to make his or her own graphs. Let me know if you find it useful for your own site or make any improvements to the code.
