It’s an hour after the last exam and people are coming back to my place to celebrate. However, my choice in music is terrible, and I don’t want to kill the buzz by playing things no one else likes. The solution: have a computer do it for me.
James developed a music analysis platform for his Part III project, which provides a Q-learning, hierarchical-clustering, Markov-model solution to the problem of playlist creation. The idea is to separate music that sounds similar into clusters, so that if there is one track in my library that I can identify as socially acceptable, the algorithms will find others like it as well. Additionally, James used modern AI techniques to model how I interact with my music in order to produce better recommendations based on different user moods (e.g. revision vs. hacking). Here’s an overview of how it works:
Feature Extraction (Learned in Part II)
The first step is to take a song and turn it into a set of descriptive features. He takes in music files that look like:
And by doing autocorrelation, a method of exaggerating repeated features and suppressing others, he turns them into signals like the following:
The high peaks labeled “good match” are then used to identify songs.
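To make the idea concrete, here is a minimal sketch of autocorrelation in plain Python. It is not James's code: the toy signal (a sine wave standing in for audio), the function name and the lag range are all assumptions made for illustration. The point is only that a signal which repeats every 20 samples produces a peak at a lag of 20.

```python
import math

def autocorrelation(signal, max_lag):
    """Normalised autocorrelation of `signal` for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    energy = sum(x * x for x in centred)
    result = []
    for lag in range(max_lag + 1):
        # Compare the signal with a copy of itself shifted by `lag` samples.
        total = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        result.append(total / energy)
    return result

# Toy "audio" (an assumption): a sine wave that repeats every 20 samples.
period = 20
signal = [math.sin(2 * math.pi * i / period) for i in range(200)]

acf = autocorrelation(signal, 40)

# Skip the trivial peak at lag 0 and look for the strongest repeat.
best_lag = max(range(5, 41), key=lambda lag: acf[lag])
print(best_lag)
```

Real music is far messier than a sine wave, so the peaks are less clean, but the repeated structure (beats, bars, riffs) shows up in the same way.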
Clustering (Learned in Part IB)
Since the first goal of the project is to divide songs into clusters (playlists) based on similarity, these features need further processing. In particular, they are fed to a clustering algorithm, which takes a set of unorganized points and divides them like this:
Each point here represents a song, and they are currently being clustered by two different features. However, this type of clustering doesn’t capture the notions of genres and subgenres, so James opted for a more refined hierarchical clustering algorithm, which does stuff like this instead:
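As a rough illustration of the hierarchical version, here is a small sketch of bottom-up (agglomerative) clustering with single linkage. I don't know which variant James actually used, so treat the linkage rule, the song names and the two made-up features per song as assumptions. It repeatedly merges the two closest clusters and keeps the merge history as a nested tuple, which is where the genre/subgenre structure comes from.

```python
import math

def single_linkage(points_a, points_b):
    """Distance between two clusters = distance of their closest pair."""
    return min(math.dist(p, q) for p in points_a for q in points_b)

def agglomerate(songs):
    """songs: dict of name -> feature tuple. Returns a nested tuple tree."""
    # Each cluster is (tree_so_far, list_of_feature_points).
    clusters = [(name, [features]) for name, features in songs.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical songs with two hypothetical features each.
songs = {
    "folk_a": (0.0, 0.0),
    "folk_b": (0.0, 1.0),
    "metal_a": (10.0, 10.0),
    "metal_b": (10.0, 11.5),
    "metal_c": (10.0, 14.0),
}

tree = agglomerate(songs)
print(tree)
```

In the resulting tree the two folk tracks sit in one branch and the three metal tracks in another, with the two closest metal tracks forming a sub-branch of their own: a playlist with a subplaylist inside it.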
What to Play Next
At this point, the algorithms have everything grouped into playlists and subplaylists, using hierarchical clustering on the characteristic features of each song. The next part of the problem is figuring out which song in the playlist to play next, given the previous song. This is done with Markov Models.
Markov Models (Learned in Part IB)
A Markov model shows how states are probabilistically linked, making the assumption that the next state depends only on the current state, not on how we got there. So for example (probabilities are heavily adjusted because Andy can see this):
If I’ve been revising, it’s highly likely that I’ll keep revising, with only a tiny chance that I’ll end up going to a pub. And even if I make it to a pub, I definitely won’t go to another one afterwards; instead it’s straight to sleep.
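Here is the same toy model as a short Python sketch. The exact numbers are invented for illustration (the only things taken from the description above are that revising mostly leads to more revising, the pub is rare, and the pub always leads to sleep), and the function names are mine.

```python
import random

# Transition probabilities: TRANSITIONS[current][next].
# All numbers are illustrative assumptions; each row sums to 1.
TRANSITIONS = {
    "revising": {"revising": 0.9, "pub": 0.02, "sleep": 0.08},
    "pub": {"sleep": 1.0},
    "sleep": {"revising": 0.95, "sleep": 0.05},
}

def next_state(current, rng):
    """Pick the next state using only the current one (the Markov assumption)."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()))[0]

def walk(start, steps, rng):
    """Generate a sequence of states by repeatedly sampling the next one."""
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path

path = walk("revising", 10, random.Random(0))
print(path)
```

Swap the three states for clusters of songs and the walk becomes a playlist: each step picks what to play next from the row belonging to whatever just played.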
James took this and replaced the states (revising, sleep, go to pub) with clusters and subclusters in his model. The model then contains the likelihoods of switching artists, genres, and different songs within the same genre. It is initially created to make similar songs likely to be played in sequence, but then uses AI to learn a better way of doing this based on user actions.
The AI – Q-Learning (Learned in Part II)
This part is more involved, so I can only give a general overview.
In essence, Q-learning is a method of teaching a computer to perform some task by shouting at it when it gets it wrong and giving it a cookie when it gets it right, i.e. learning by reinforcement. In this case, the recommender did a good job if the user listens to a track all the way through and a bad job if the user skips over it after the first few seconds.
With this information, the system can update the Markov model describing how the next song is chosen. For more information, here’s a link to our AI II course’s notes on the topic (page 339): http://www.cl.cam.ac.uk/teaching/1314/ArtIntII/ai2-2014.pdf
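For the curious, the core of Q-learning is a one-line update rule. The sketch below shows the standard textbook version, not James's implementation: I'm assuming the "state" is the cluster of the track that just played, the "action" is the cluster to pick from next, and the reward is +1 for a full listen and -1 for a skip. The learning rate and discount factor are arbitrary picks.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)

def q_update(q, state, action, reward, next_state, actions):
    """Standard Q-learning update:
    Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

# Hypothetical clusters to choose between.
actions = ["folk", "metal"]
q = defaultdict(float)  # every (state, action) value starts at 0

# After a folk track we played more folk and the user listened: cookie.
q_update(q, "folk", "folk", 1.0, "folk", actions)
# After a folk track we played metal and the user skipped it: shouting.
q_update(q, "folk", "metal", -1.0, "metal", actions)

print(q[("folk", "folk")], q[("folk", "metal")])
```

After these two bits of feedback, "folk then folk" scores higher than "folk then metal", so a recommender that prefers high-scoring moves would become more reluctant to jump genres. How exactly those scores feed back into the Markov model is the part covered in the course notes.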
Now that James has done the hard work for us, you can try out his system at james.eu.org/download to see for yourself how it works.