On 23 September 2014, I gave an invited presentation to a meeting of the Creative Data Club, organised by Sound and Music, the national agency for new music. Also speaking were Chris Unitt from One Further, presenting a study on how arts organisations are using Facebook; Jay Short from inition, showcasing 3D-printed visualisations of social media data; Dan Simpson, talking about his crowdsourcing of poetic composition; and shardcore, telling the hilarious and poignant tale of Alex the twitterbot. Here are the slides, plus a few audience responses and livetweets at the end. The text is based on the same handwritten notes that I extemporised from on the day.
(Photograph by Sound and Music)
1. Introduction
Valuing Electronic Music is an AHRC-funded project looking at producers of electronic music and especially electronic dance music. It’s carried out by three people: Byron Dueck, an ethnomusicologist based at the Open University, Anna Jordanous, a computer scientist based at the University of Kent, and myself. I’m an applied linguist and cultural sociologist based at the Open University.
Our focus is on how the value of this music is produced, both online and off. We initially combined interviews with producers and automated scraping of the SoundCloud website to see who those producers follow, and who follows them. In the later stages of the research, we also collected data on a random sample of 150,000 SoundCloud users. This enabled us to look at larger patterns, and also to carry out linguistic analysis on the comments that people were leaving on tracks.
2. Network A: ‘Tom’
This graph represents the SoundCloud users who follow and are followed by one of our interviewees: an electronic musician who has released several albums on independent labels and who had thousands of followers on MySpace in the days when it was cool. He only has a few hundred followers on SoundCloud, partly because he hasn’t released any new music for a while and partly because he’s reluctant to invest time in it in case it becomes the new MySpace (and not in a good way).
An arrow pointing from node X to node Y indicates that the user represented by node X follows the user represented by node Y. The size of a node depends on how many other nodes in the graph have arrows pointing towards it. Our interviewee – let’s call him ‘Tom’ – is represented by the largest node in the graph, because this graph overwhelmingly represents his followers: he doesn’t follow many people on SoundCloud. Nodes in the fan arrangement towards the top right represent users who are connected only to Tom. Every other node represents a user who follows or is followed by at least one other user who follows or is followed by Tom. So we can see that Tom didn’t just find an audience here – he found an audience of people who also form an audience for each other.
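For readers who want to see the mechanics, here is a minimal sketch of how node sizes and the ‘fan’ of users connected only to Tom could be derived from a list of follow relationships. The user names and edges are invented for illustration, and this is not the code used in the project.

```python
from collections import Counter

# Hypothetical follow edges: (follower, followed). All names are invented.
follows = [
    ("ann", "tom"), ("bob", "tom"), ("cat", "tom"), ("dan", "tom"),
    ("ann", "bob"), ("bob", "ann"), ("tom", "cat"),
]

# Node size: the number of arrows pointing at each node (its in-degree).
in_degree = Counter(followed for _, followed in follows)

# Neighbours of each node, ignoring the direction of the arrows.
neighbours = {}
for follower, followed in follows:
    neighbours.setdefault(follower, set()).add(followed)
    neighbours.setdefault(followed, set()).add(follower)

# The "fan": users whose only connection in the graph is to Tom.
fan = sorted(user for user, ns in neighbours.items() if ns == {"tom"})

print(in_degree.most_common(1))  # the largest node
print(fan)
```

In this toy data, ‘ann’ and ‘bob’ follow each other as well as Tom, so they fall outside the fan, while ‘cat’ and ‘dan’ are connected to Tom alone.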
Although Tom’s followers follow each other, they don’t do so in equal measure. The nodes are coloured by city, with each of the most common five cities among people connected to Tom being represented by a different colour, and all others being represented by grey. This enables us to see both that people who follow Tom are most likely to be based in the same city as Tom – i.e. London, in blue – and that they seem more likely to follow or be followed by other followers of Tom from their own cities.
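The colouring rule, and a rough check on whether followers link to others in their own city, can be sketched as follows. The cities, links, and colour names are invented, and the toy example uses the two most common cities rather than five.

```python
from collections import Counter

# Hypothetical city labels and follower-to-follower links (invented data).
city = {"ann": "London", "bob": "London", "cat": "Bristol",
        "dan": "Bristol", "eve": "Leeds"}
links = [("ann", "bob"), ("bob", "ann"), ("cat", "dan"), ("ann", "cat")]

TOP_N = 2                    # the graph in the talk used five
palette = ["blue", "green"]  # one colour per top city (assumed names)

# The most common cities get their own colour; everyone else is grey.
top_cities = [c for c, _ in Counter(city.values()).most_common(TOP_N)]
colour_of_city = dict(zip(top_cities, palette))
node_colour = {user: colour_of_city.get(c, "grey") for user, c in city.items()}

# Share of links that join two users based in the same city.
same_city = sum(city[a] == city[b] for a, b in links)
same_city_share = same_city / len(links)

print(node_colour)
print(same_city_share)
```

A raw share like this does not control for how many followers each city has, so it is only a first look rather than a test of the pattern.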
3. Network B: ‘David’
This graph represents users who followed another of our interviewees, represented by the largest node: let’s call him ‘David’. David doesn’t follow anyone on SoundCloud, but he has thousands of followers there: about as many as Tom had on MySpace. His first album has just come out, on another independent label.
This graph is coloured according to the same principles as the previous one, and among David’s followers, we see essentially the same pattern as among Tom’s: most are – like David (and indeed Tom) – based in London, and they tend to follow or be followed by other followers from the same cities. Here, the second most common city is Bristol, in green, which is also the city where the user represented by the second largest node is located. 48% of David’s followers are shared with this user. Both he and David self-identify as producers of the electronic music genre known as grime.
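A figure such as the percentage of shared followers comes down to a set intersection. The sketch below uses invented follower IDs, so the resulting percentage is illustrative only.

```python
# Hypothetical follower sets for two producers (invented user IDs).
david_followers = {"u1", "u2", "u3", "u4", "u5"}
other_followers = {"u3", "u4", "u6"}

# Followers the two producers have in common.
shared = david_followers & other_followers

# Shared followers as a percentage of David's followers.
pct_shared = 100 * len(shared) / len(david_followers)

print(sorted(shared), pct_shared)
```

Note that the percentage depends on whose followers form the denominator: the same overlap is a larger share of the smaller audience.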
[n.b. There’s more on how people in different cities follow each other here.]
4. Use of genre classifications by uploaders
Using our random sample of 150,000 SoundCloud users, we decided to look at the genres people were using to classify their tracks. We identified both the single most commonly used genre and the three most commonly used genres for every user who had uploaded tracks.
This graph shows the most common genres, with the size of each node reflecting the number of users in our sample that primarily uploaded tracks in that particular genre. Links between genres reflect how frequently each pair of genres appeared in a single user’s three most frequent genres. If one user in our sample has both techno and jazz in his or her top three, there will be a very thin line linking the ‘techno’ and ‘jazz’ nodes. If lots of people do, there will be a thick one.
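The construction of this graph can be sketched in a few lines. The users and genre tags below are invented, and the way ties between equally common genres are broken is an assumption of the sketch.

```python
from collections import Counter
from itertools import combinations

# Hypothetical genre tags, one entry per uploaded track (invented data).
tracks_by_user = {
    "u1": ["techno", "techno", "tech-house", "jazz"],
    "u2": ["hip hop", "rap", "hip hop"],
    "u3": ["techno", "tech-house", "tech-house"],
}

node_size = Counter()    # users whose single most common genre is this one
edge_weight = Counter()  # users with both genres among their top three

for genres in tracks_by_user.values():
    # Up to three most common genres for this user, most common first.
    top_three = [g for g, _ in Counter(genres).most_common(3)]
    node_size[top_three[0]] += 1
    # One count for every pair of genres in the user's top three.
    for pair in combinations(sorted(top_three), 2):
        edge_weight[pair] += 1

print(node_size)
print(edge_weight)
```

Here two users have both techno and tech-house in their top three, so that link is twice as thick as the one between techno and jazz, and there is no link at all between techno and hip hop.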
We then automatically identified highly connected groups of genres using an algorithm called the Louvain community detection method. This indicated that there were three broad clusters of genres in the above graph. We might call them ‘macro-genres’. In blue, there are electronic dance music genres such as dubstep and house. In red, there are urban music genres such as hip hop and rap. In green, there are all the other genres, from metal to country.
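The Louvain method searches for a grouping of nodes with a high ‘modularity’ score: roughly, one in which more of the link weight falls inside groups than would be expected by chance. The sketch below computes that score for two candidate groupings of a toy genre graph. It does not implement the Louvain search itself, and the weights are invented.

```python
from collections import defaultdict

# Hypothetical co-occurrence weights between genres (invented numbers).
edges = {
    ("techno", "tech-house"): 5,
    ("hip hop", "rap"): 6,
    ("techno", "hip hop"): 1,
}

def modularity(edges, communities):
    """Modularity of a partition of a weighted, undirected graph."""
    m = sum(edges.values())  # total edge weight
    community_of = {g: i for i, c in enumerate(communities) for g in c}
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed weighted degree of each community
    for (a, b), w in edges.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))

two_clusters = [{"techno", "tech-house"}, {"hip hop", "rap"}]
one_cluster = [{"techno", "tech-house", "hip hop", "rap"}]

q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
print(q_two, q_one)
```

Splitting the toy graph into a dance cluster and an urban cluster scores higher than lumping everything together, which is the kind of comparison the algorithm makes repeatedly while it searches.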
This suggests that if you commonly upload tracks in a particular genre, you’re more likely to upload tracks in other genres within the same macro-genre than you are to upload tracks in genres within one of the other two macro-genres. There were lots of people in our sample who commonly uploaded tracks both in the techno genre and in the tech-house genre, for example, and even more who commonly uploaded tracks both in the hip hop genre and in the rap genre, but there were very few who commonly uploaded tracks both in the hip hop genre and in the techno genre, even though hip hop and techno were among the most commonly uploaded genres overall.
[n.b. There’s a fuller explanation of the methodology here and here, though the data analysed there is less robust.]
5. Following relationships between uploaders of popular genres
This graph shows the same genres, coloured as above, but it focuses on relationships of following, not uploading. If one techno uploader follows a hip hop uploader, there will be a link with a small arrowhead pointing from techno to hip hop. If many techno uploaders follow hip hop uploaders, the link will have a big arrowhead. The more incoming arrows and the bigger the arrowheads, the bigger the node representing a given genre.
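In outline, this means collapsing user-to-user follows into genre-to-genre counts. The sketch below uses invented users and assumes that each uploader is assigned a single primary genre, that node size is the total of incoming counts, and that follows within a genre count towards it.

```python
from collections import Counter

# Hypothetical primary genre for each uploader (invented data).
primary_genre = {"u1": "techno", "u2": "techno", "u3": "hip hop",
                 "u4": "rap", "u5": "folk"}

# Hypothetical follow edges: (follower, followed).
follows = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"),
           ("u3", "u4"), ("u4", "u3"), ("u5", "u3")]

# Arrowhead size: how many follows run from one genre to another.
arrow = Counter((primary_genre[a], primary_genre[b]) for a, b in follows)

# Node size: total follows received by uploaders of each genre.
node_size = Counter()
for (_, target_genre), count in arrow.items():
    node_size[target_genre] += count

print(arrow)
print(node_size)
```

In the toy data, hip hop receives follows from techno, rap, and folk uploaders and so ends up as the largest node, while folk receives none.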
As you can see from the way the colours are grouped together, people who upload urban music tracks tend to follow other people who upload urban music tracks, and people who upload electronic dance music tracks tend to follow other people who upload electronic dance music tracks, but people who upload tracks in other genres are much less likely to be followed – even by each other.
6. Comments wordcloud
This wordcloud, created using wordle.net, shows the words most commonly used in comments on SoundCloud tracks by members of our sample. One of the most striking things you’ll notice is how positive the words are. This reflects how much positive evaluation is done in SoundCloud comments. A wordcloud based on YouTube comments would probably look very different, because commenting on that website is so antagonistic that somebody’s even written a book about it.
While the prominent words in the above are very friendly, they are also notably masculine (‘man’, ‘bro’, ‘mate’, etc). At our public event, grime producer Paul Lynch (better known as Slackk) observed that while many of the people that turn up to his raves are women, the people who interact with him on SoundCloud and Twitter are all men. This obviously has implications for the use of data from these websites to study value, because it suggests that the acts of valuing they record may disproportionately reflect the judgements of men.
[n.b. There’s more extensive discussion of this wordcloud here.]
7. Dubstep comment keywords
The wordcloud shows the vocabulary of SoundCloud comments in general, but it turns out that different words are used to evaluate works of different genres. To explore this, we took the four genres that were most frequently commented on by members of our sample – dubstep, techno, hip hop, and house – and created a corpus of comments that sample members had left on tracks of each of these genres (excluding their own tracks). We then used the corpus linguistic analysis program AntConc to identify the key words for each of these genre-based corpora.
A word has high ‘keyness’ for a particular genre if it is more common in comments on tracks of that genre than it is in comments on tracks of the other three genres. As you can see from the above table, the word with the highest keyness for the dubstep genre was ‘sick’. In every thousand words of comments on dubstep tracks, the word ‘sick’ cropped up nearly 19 times – but it cropped up far less frequently elsewhere. So the high keyness for this word indicates that SoundCloud users in our sample used this word to praise dubstep tracks far more frequently than they used the same word to praise techno, house, or hip hop tracks. Those genres appeared to have different evaluative vocabularies.
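To make the idea of keyness concrete, here is a minimal sketch of two calculations: frequency per thousand words, and a log-likelihood score of the kind commonly used for keyness in corpus linguistics. I am assuming log-likelihood here for illustration; the table above reports whatever statistic AntConc was configured to use. All counts and corpus sizes are invented.

```python
import math

def per_thousand(count, corpus_size):
    """Occurrences of a word per thousand words of a corpus."""
    return 1000 * count / corpus_size

def log_likelihood(target_count, target_size, ref_count, ref_size):
    """Log-likelihood keyness of a word in a target vs. a reference corpus."""
    total = target_count + ref_count
    # Expected counts if the word were equally frequent in both corpora.
    expected_target = target_size * total / (target_size + ref_size)
    expected_ref = ref_size * total / (target_size + ref_size)
    score = 0.0
    if target_count:
        score += target_count * math.log(target_count / expected_target)
    if ref_count:
        score += ref_count * math.log(ref_count / expected_ref)
    return 2 * score

# Hypothetical counts: 'sick' in dubstep comments vs. the other three genres.
dubstep_rate = per_thousand(1880, 100_000)
other_rate = per_thousand(900, 300_000)
keyness = log_likelihood(1880, 100_000, 900, 300_000)

print(dubstep_rate, other_rate, keyness)
```

A word that is equally frequent in both corpora scores zero, and the score grows as the two rates diverge. The sign of the difference has to be read from the rates themselves, since the score alone does not say which corpus uses the word more.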
8. Hip hop comment keywords
When we look at the equivalent table for comments left on hip hop tracks, the first thing that we notice is that ‘sick’ has become ‘dope’. The second most key term, ‘shit’, was also a positive term, as in ‘good shit’. Other words in this table reflect the fact that comments on hip hop tracks had a much more explicitly interactive style than comments on tracks in the various electronic dance music genres. Many of them were overtly styled as rapper-to-rapper expressions of respect, and they often directly invited reciprocation, e.g. ‘Dope shit bro, check me out, leave a comment’.
9. Contact details
This work is ongoing. As always: if you find what we are doing interesting, we’d love to hear from you.
Audience responses
The questions at this event were really thought-provoking for me, because it was the first time that a presentation of the project’s findings sparked discussion of the implications for music promotion. A point that I particularly remember was the observation that if two music makers share a substantial number of SoundCloud followers (as in slide 3, above), they probably ought to consider doing a gig together.