First things first – if you’re not signed up for Last.FM and you’re interested in diving into your listening history, I’d highly recommend signing up. It’s free, easy to use, and you get some decent weekly and yearly reporting right out of the box. To start tracking your listening history, all you need to do is connect your Spotify account to Last.FM in the settings on their website, and you’ll be on your way to building your very own listening data set. Last.FM currently works best with Spotify Premium accounts, but you can also track your plays on some other players using their official tracking tool.
The best part of this sort of analysis is that anyone can do it once they have a Last.FM account. Simply use a tool like this Last.FM data export tool to extract your raw listening data into a CSV. After importing that file into a simple spreadsheet tool like Excel or Google Sheets, you can cut your data in any way you can imagine using a pivot table or a few formulas. If you want to take it a step further, you can import your data into a heavier data analysis or visualization tool, such as my program of choice, Tableau. I go into more detail about accessing, manipulating, and importing your Last.FM data in previous posts, so make sure to check those out before you start on your data journey.
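If you’d rather script the pivot-table step than build it in a spreadsheet, here is a minimal sketch using only Python’s standard library. The column names (`artist`, `album`, `track`, `played_at`) and the timestamp format are assumptions for illustration; your export tool may lay the file out differently, so adjust them to match your CSV.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample in an ASSUMED export layout; real exports may differ.
SAMPLE_CSV = """artist,album,track,played_at
Artist A,Album X,Song 1,2018-05-07 09:15
Artist A,Album X,Song 2,2018-05-07 09:19
Artist B,Album Y,Song 3,2018-06-01 18:02
"""

def scrobbles_per_month(csv_file):
    """Count plays per (year, month), like a pivot table on the timestamp."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        played = datetime.strptime(row["played_at"], "%Y-%m-%d %H:%M")
        counts[(played.year, played.month)] += 1
    return counts

# With a real file, pass open("your_export.csv", newline="") instead.
monthly = scrobbles_per_month(io.StringIO(SAMPLE_CSV))
for (year, month), plays in sorted(monthly.items()):
    print(f"{year}-{month:02d}: {plays}")
```

The same counting pattern works for any other cut of the data: swap the `(year, month)` key for the artist, the date, or the hour.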
Let’s start with the basics: the chart above shows a month-over-month breakdown of my scrobbles (plays) over the past two years. In 2018, I had a total of 19,400 scrobbles, which was an increase of 40% over 2017. This is due to my being more active in ensuring that my listening history was properly documented on Last.FM, and to an effort on my part to listen to and discover as much music as possible. One of my New Year’s resolutions at the beginning of 2018 was to listen to more music each month, and I accomplished that goal for the first 8 months of the year.
The above chart displays all 19,400 of my plays in a colorful bar chart, where the taller a bar is, the more scrobbles I had on that day. Each little colored box in each bar represents a different track/artist, so the number of boxes shows how many distinct songs I listened to on a particular day. You can begin to draw some general conclusions from this visual, but besides looking pretty cool and hinting at the cyclical nature of my listening habits, it doesn’t really allow for anything truly in-depth.
Here’s a different and much more interesting way to visualize the same data, where the vertical axis refers to the month, the horizontal axis corresponds to the calendar day of each month, and the boxes show how many plays I had on a particular day, with a redder/hotter color denoting more listening. For instance, on May 7th, I had 133 scrobbles, which was the most for me in one day in all of 2018. During the year, I averaged 53 scrobbles a day (19,400/365), so I would consider any day with over about 80 scrobbles to be a “heavy listening day.”
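The arithmetic behind the daily average and the “heavy listening day” cutoff can be sketched in a few lines. The total, the 365 days, the cutoff of 80, and the May 7th count come from the numbers above; the other daily counts are made up purely for illustration.

```python
from datetime import date

TOTAL_SCROBBLES = 19_400    # 2018 total quoted in the post
DAYS_IN_YEAR = 365
HEAVY_DAY_THRESHOLD = 80    # the post's rough cutoff for a heavy day

average_per_day = TOTAL_SCROBBLES / DAYS_IN_YEAR
print(f"Average scrobbles per day: {average_per_day:.1f}")

def heavy_listening_days(daily_counts, threshold=HEAVY_DAY_THRESHOLD):
    """Return the dates whose play count exceeds the threshold, in order."""
    return sorted(day for day, plays in daily_counts.items() if plays > threshold)

# Only the May 7 value is from the post; the rest are hypothetical.
daily_counts = {
    date(2018, 5, 5): 18,
    date(2018, 5, 7): 133,
    date(2018, 5, 8): 64,
}
print(heavy_listening_days(daily_counts))
```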
Another way to look at this data is by day of the week. Last year, these visualizations showed me that the majority of my listening occurred on weekdays, especially days when I was at work, where I can listen to music for a good part of the day and during my commute. The same pattern holds this year, with an average weekday having nearly three times as many scrobbles as an average weekend day. Also as with last year, the weekday numbers aren’t far apart, with the total number of scrobbles on Tuesdays vs. Wednesdays separated by only 65 plays. This year, Mondays were far and away my highest listening day; last year, that honor went to Tuesdays.
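For anyone who wants to reproduce the day-of-week rollup outside of Tableau, here is a rough sketch that groups daily play counts by weekday and averages them. Apart from the 133 plays on May 7th, the sample counts are invented for illustration and are not my real data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

def average_by_weekday(daily_counts):
    """Group daily play counts by day of the week and average each group."""
    grouped = defaultdict(list)
    for day, plays in daily_counts.items():
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        grouped[WEEKDAY_NAMES[day.weekday()]].append(plays)
    return {name: mean(values) for name, values in grouped.items()}

# Hypothetical counts, except May 7 (133 plays, mentioned in the post).
daily_counts = {
    date(2018, 5, 7): 133,   # Monday
    date(2018, 5, 12): 20,   # Saturday
    date(2018, 5, 13): 22,   # Sunday
    date(2018, 5, 14): 71,   # Monday
}

averages = average_by_weekday(daily_counts)
for name in WEEKDAY_NAMES:
    if name in averages:
        print(f"{name}: {averages[name]}")
```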
Let’s go a little further with this analysis using a box and whisker plot. Each box and whisker in the above visualization refers to a day of the week. Looking at the data this way is remarkably informative, since now we can see the individual data points for each day that rolled up into the day of the week circles above. The goal of this sort of visualization is to easily show 1) the median of the data (denoted by the vertical line in the middle of the shaded box), 2) the range of the data (i.e. the spread between the minimum and the maximum), and 3) any outliers in the data. For example, on Mondays, the median was 71 plays/scrobbles – meaning that on half of the Mondays in the year I listened to more than 71 songs, and on the other half I listened to fewer. The max is the aforementioned 133 scrobble day on May 7th, while the minimum was 1 play on January 1st. It’s also easy to see in this visual how much less music I listen to on the weekends – the median number of plays on Saturday and Sunday hovers around 20.
Looking at the outliers is where things start to get really interesting. On this chart, outliers are represented by dots that aren’t covered by the box or whiskers – they fall outside the vertical lines at the ends of the whiskers. Outliers are data points which, as the name suggests, lie outside the bulk of the data. They do not match the rest of the data, and they fall outside the trends the other data points suggest (for those with a statistics background, Tableau flags a point as an outlier when it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile). On the weekdays, the outliers are on the low end – check out Tuesday, Wednesday, and Thursday, where there are 5 total outliers, all representing days where I had 10 scrobbles or fewer. The exact opposite is true on the weekends – the outliers are above/to the right of the box and whisker plot. Put simply, in contrast to the weekdays, the outliers on weekend days are days where I listened to more music than usual.
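To make the 1.5 times interquartile range rule concrete, here is a small sketch of how the box, the fences, and the outliers can be computed. The play counts are hypothetical, and the quartile method (`statistics.quantiles` with `method="inclusive"`) is my own choice for illustration; Tableau or a spreadsheet may compute quartiles slightly differently, so the exact cutoffs can vary a little.

```python
from statistics import median, quantiles

def box_plot_summary(values, whisker_factor=1.5):
    """Return the median, quartiles, outlier fences, and outliers of a sample."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "low_fence": low_fence,
        "high_fence": high_fence,
        "outliers": outliers,
    }

# Hypothetical daily play counts for one weekday (not real data).
plays = [4, 52, 58, 61, 66, 70, 73, 77, 81, 88, 95]
summary = box_plot_summary(plays)
print(summary)
```

In this made-up sample, the day with only 4 plays falls below the lower fence, which mirrors the low-end weekday outliers described above.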
This is an update to a clock-based visualization I made last year, representing the number of scrobbles I had per hour of the day. Unsurprisingly, the data here is consistent with the conclusions from last year: I listen to the most music during working hours, with a slight dip during my commuting hours and during the busy meeting/lunch period in the late morning and around noon. In the re-classified visualization below, green represents my typical commuting or getting-ready-for-work hours, orange represents time at work, pink is early evening time at home, and blue is typically hours when I’m sleeping. I’ve included weekends in both of these visualizations; the results do not shift much when weekends are included.
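The hour-of-day rollup and the re-classification into categories can be sketched like this. The hour boundaries in `classify_hour` are placeholders I picked for the example, not the exact ranges used in the visualization, and the timestamps are made up.

```python
from collections import Counter
from datetime import datetime

def classify_hour(hour):
    """Map an hour of the day (0-23) to a category.

    The boundaries are HYPOTHETICAL; adjust them to your own schedule.
    """
    if 9 <= hour < 17:
        return "work"
    if 6 <= hour < 9 or 17 <= hour < 18:
        return "commute / getting ready"
    if 18 <= hour < 23:
        return "evening at home"
    return "sleep"

# Made-up scrobble timestamps for illustration.
timestamps = [
    datetime(2018, 5, 7, 7, 45),
    datetime(2018, 5, 7, 10, 5),
    datetime(2018, 5, 7, 10, 40),
    datetime(2018, 5, 7, 19, 30),
]

plays_per_hour = Counter(ts.hour for ts in timestamps)
plays_per_category = Counter(classify_hour(ts.hour) for ts in timestamps)

print(sorted(plays_per_hour.items()))
print(plays_per_category.most_common())
```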
Coming up in the next few weeks, I’ll be breaking down my 2018 listening data by top songs, albums, and artists, digging into the Echo Nest data well once again to see how my musical preferences have changed year over year, and taking a closer look at how I listen to my seasonal playlists.