Artificial music - Chalkdust

23 October 2019

Out of all the words in the English dictionary, art is possibly the one with the most debatable definition. In his 1897 book What Is Art?, Russian writer Leo Tolstoy argued that “art begins when a person, with the purpose of communicating to other people a feeling they once experienced, calls it up again within themself and expresses it by certain external signs”. An important aspect in Tolstoy’s argument is that of the artist’s sincerity—that is, the extent to which the artist has experienced the feeling that they are expressing—which is crucial in determining the appreciation of the work by others.

Contrary to Tolstoy’s belief is the one popularised by the French writer Théophile Gautier in the early 19th century, summarised in the slogan l’art pour l’art—art for art’s sake. For Gautier, the intrinsic value of a work of art has to be completely detached from any sort of sentimental, social or moral context.

New technologies add a layer of complexity to the old and never-ending discussion about what should be considered art. What would a conversation between Tolstoy and Gautier be like after they had been presented with one of Emmy’s musical compositions? Emmy, short for ‘Experiments in Musical Intelligence’, was created in 1981 by David Cope, now professor emeritus at the University of California, Santa Cruz. Cope, who was suffering from composer’s block, wanted to build software able to generate new material in line with his own pieces, using these pieces as the main input for the software. However, because he did not have enough works of his own, he started by taking the pieces of various classical composers as the input for his computer programs instead. After spending some time perfecting Emmy, Cope was able to produce, in a matter of minutes, thousands of new pieces of music in JS Bach’s style. This resulted in the 1993 release of Bach by Design, one of his several computer-generated music albums.

Since Cope’s first experiments, music-generating systems using artificial intelligence have advanced considerably. Nowadays, there are all sorts of user-friendly systems: IBM Watson Beat, Google Magenta’s NSynth Super, Jukedeck, Melodrive, Spotify’s Creator Technology Research Lab, Amper Music, and so on. Some music systems, like Amper, have explicitly been taught the rules of music theory. However, most AI music systems use artificial neural networks to generate output. The neural networks identify patterns in the many samples of source material they are fed. These patterns are then used to create new music in the form of an audio file or a music score. While some systems will simply create a melody from a given note, others are able to harmonise a given melody.

A chorale harmonisation or a chorale is a musical piece traditionally intended to be sung by a congregation during a German Protestant service. It is often written for soprano, alto, tenor and bass. The soprano is the voice that holds the melody, which is usually a Lutheran hymn tune, while the other three voices provide the harmony.

For a taste of what AI is capable of doing, you can have a look at the Google Doodle from 21 March 2019, celebrating Bach’s 334th birthday. Coconet is the machine learning model that makes this Doodle work. Trained with a relatively small dataset of 306 chorale harmonisations by Bach, Coconet can harmonise a melody entered by the user in Bach’s contrapuntal style in a matter of seconds. The mechanisms behind the Doodle are explored in the following section.

Coconet in a nutshell

Coconet’s task involves taking incomplete musical scores and filling in the missing material. For the result to be faithful to Bach’s style, Coconet first needs to be trained to recognise the ‘right’ style. This training is done by randomly erasing some notes from the original chorales composed by Bach and asking Coconet to reconstruct the erased notes. Each reconstruction is given a rank that quantifies how closely Coconet’s version matches Bach’s. Coconet is then encouraged to repeat high-ranked guesses in future reconstructions of incomplete music scores, while trying to avoid low-ranked guesses.
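The erase-and-reconstruct idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy four-note chorale, the function names erase_notes and reconstruction_loss, and the use of an average negative log-probability as the ‘rank’ are hypothetical choices made for this example, not details taken from Coconet itself.

```python
import math
import random

# A toy "chorale" (hypothetical data): four voices, one MIDI pitch per time step.
CHORALE = [
    [72, 74, 76, 77],  # soprano
    [67, 67, 72, 72],  # alto
    [64, 62, 67, 69],  # tenor
    [48, 55, 60, 53],  # bass
]

def erase_notes(chorale, erase_prob, rng):
    """Return a copy with some notes replaced by None, plus the erased positions."""
    masked = [row[:] for row in chorale]
    erased = []
    for v, row in enumerate(chorale):
        for t in range(len(row)):
            if rng.random() < erase_prob:
                masked[v][t] = None
                erased.append((v, t))
    return masked, erased

def reconstruction_loss(predicted, chorale, erased):
    """Average negative log-probability given to Bach's actual pitches.

    `predicted[(v, t)]` maps pitch -> probability for the erased note at (v, t).
    Lower is better; this stands in for the 'rank' and is an assumption.
    """
    if not erased:
        return 0.0
    total = 0.0
    for v, t in erased:
        # Pitches the model did not mention get a tiny probability instead of zero.
        p = predicted[(v, t)].get(chorale[v][t], 1e-9)
        total -= math.log(p)
    return total / len(erased)
```

On this measure, a model that spreads its belief evenly over all 128 MIDI pitches scores log(128), about 4.85, per erased note, while one that predicts every original pitch with certainty scores 0. Training would nudge the model towards lower values.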

Once trained, Coconet does not return a single answer for each missing note; it returns a probability distribution over the possible pitches. So how is the music extracted from these probability distributions? One might naively think that it is enough to pick, for each voice independently, the pitch with the highest probability for each missing note. However, Bach chorales are all about harmony and harmony is all about interactions between notes; the melodic lines of the different voices cannot be considered in isolation.

There are several ways to account for these interactions. Perhaps the most obvious one would be to assign the highest-probability pitches to one of the voices, and then feed this new version of the incomplete chorale back into Coconet. The model would update the probability distributions for the other voices. The process could then be iterated until all the voices are complete. Although it is simple, this solution is not ideal; very different results might be obtained depending on which voice is completed first.
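A toy example makes both problems concrete. The sketch below assumes a made-up joint probability table for two missing notes (alto and tenor) on a single beat; the numbers and the helper names are hypothetical and chosen only to show the effect, not drawn from Coconet.

```python
# Hypothetical joint probabilities for two missing notes (alto, tenor) on one beat.
JOINT = {
    ("E4", "C4"): 0.03,
    ("E4", "A3"): 0.42,
    ("F4", "C4"): 0.40,
    ("F4", "A3"): 0.15,
}

def marginal(voice):
    """Probability of each pitch for one voice, ignoring the other voice."""
    out = {}
    for pitches, p in JOINT.items():
        out[pitches[voice]] = out.get(pitches[voice], 0.0) + p
    return out

def conditional(voice, other_pitch):
    """Unnormalised probabilities for `voice` once the other voice is fixed."""
    other = 1 - voice
    return {k[voice]: p for k, p in JOINT.items() if k[other] == other_pitch}

def argmax(dist):
    """Return the key with the largest value."""
    return max(dist, key=dist.get)

def independent():
    """Pick the most likely pitch for each voice in isolation."""
    return (argmax(marginal(0)), argmax(marginal(1)))

def sequential(first):
    """Fix the `first` voice greedily, then choose the other given that choice."""
    second = 1 - first
    chosen = {first: argmax(marginal(first))}
    chosen[second] = argmax(conditional(second, chosen[first]))
    return (chosen[0], chosen[1])
```

With these numbers, independent() returns ("F4", "A3"), a pair to which the table gives a probability of only 0.15. Completing the alto first, sequential(0), returns ("F4", "C4"), whereas completing the tenor first, sequential(1), returns ("E4", "A3"): the answer depends on the order in which the voices are filled.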

Coconet opts for a more robust solution. At first, all the pitches in the incomplete chorale are filled in simultaneously according to the highest probabilities for each of the individual voices. But this result is just taken as a draft. Then, some of the guesses are randomly erased and the new incomplete chorale is fed into Coconet again. New probability distributions are obtained for the new gaps. The process, called blocked Gibbs sampling, is repeated until the probability distributions given at consecutive iterations are similar enough that they always yield the same pitches.
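The erase-and-refill loop can be sketched as follows. Several things here are assumptions made for illustration: toy_model is a hand-written stand-in for the trained network (it merely penalises dissonant intervals and large melodic leaps), the voice ranges are invented, each gap is filled by drawing a sample from its distribution rather than always taking the top pitch, and the loop runs for a fixed number of sweeps instead of testing for convergence.

```python
import math
import random

# Hypothetical MIDI ranges for soprano, alto, tenor and bass.
VOICE_RANGES = [range(60, 82), range(55, 75), range(48, 68), range(40, 61)]
DISSONANT = {1, 2, 6, 10, 11}  # intervals (mod 12) that this toy model penalises

def toy_model(score, v, t):
    """Stand-in for the trained network: a weight for each candidate pitch at (v, t)."""
    weights = {}
    for pitch in VOICE_RANGES[v]:
        penalty = 0.0
        for other in range(len(score)):
            note = score[other][t]
            if other != v and note is not None and abs(pitch - note) % 12 in DISSONANT:
                penalty += 2.0  # clash with a note sounding at the same time
        if t > 0 and score[v][t - 1] is not None:
            penalty += 0.2 * abs(pitch - score[v][t - 1])  # prefer small steps
        weights[pitch] = math.exp(-penalty)
    return weights

def sample(weights, rng):
    """Draw one pitch with probability proportional to its weight."""
    pitches = list(weights)
    return rng.choices(pitches, weights=[weights[p] for p in pitches], k=1)[0]

def refill(score, gaps, rng):
    """Resample every gap at once, each conditioned only on the notes still present."""
    dists = {(v, t): toy_model(score, v, t) for v, t in gaps}
    for (v, t), weights in dists.items():
        score[v][t] = sample(weights, rng)

def harmonise(melody, sweeps, erase_prob, rng):
    """Keep the soprano melody fixed and fill the three lower voices."""
    steps = len(melody)
    score = [list(melody)] + [[None] * steps for _ in range(3)]
    free = [(v, t) for v in range(1, 4) for t in range(steps)]
    refill(score, free, rng)  # first draft: all gaps filled simultaneously
    for _ in range(sweeps):
        block = [pos for pos in free if rng.random() < erase_prob]
        for v, t in block:
            score[v][t] = None  # erase a random block of guesses
        refill(score, block, rng)  # query the model again for the new gaps only
    return score
```

Calling harmonise([72, 74, 76, 77], sweeps=20, erase_prob=0.5, rng=random.Random(0)) returns a four-voice score whose top line is the given melody. Because every erased note is redrawn with the surviving notes of all voices as context, no single voice is privileged by being completed first.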

The diverse opinions about the final products are as interesting as, if not more interesting than, the mechanisms behind AI-generated music. The audience’s reaction to artificially generated music was spectacularly tested at the University of Oregon in 1997. There, the pianist Winifred Kerner performed three pieces: one written by her husband, the composer Steve Larson; another written by Bach; and a third generated by Emmy. After her performance, the audience was asked to guess which was which. To Larson’s despair, the audience concluded that his composition had been created by Emmy and that Emmy’s work was genuine Bach.

Larson was not the only one who felt uncomfortable about the fact that Emmy had been able to fool a whole audience. American professor of cognitive science Douglas Hofstadter, author of the 1979 book Gödel, Escher, Bach, which won a Pulitzer prize, had argued that a machine “would have to wander around the world on its own, fighting its way through the maze of life and feeling every moment of it” in order to produce anything similar to the masterpieces. In a 1997 article published by the New York Times, he claimed that the only comfort he could take from Larson’s experiment in front of the audience was that “Emmy doesn’t generate style on its own. It depends on mimicking prior composers”.

The introduction of this sort of sophisticated and yet easy-to-use system not only opens a philosophical discussion on what should be called art, but also raises an ethical problem. In 2017, music-streaming service Spotify hired AI researcher François Pachet as the new director of the Spotify Creator Technology Research Lab. The hiring added even more weight to the accusation made by the magazine Music Business Worldwide that the platform had launched several playlists authored by fictional artists. These playlists, with around 500 million streams, were mood-themed with titles such as ‘peaceful piano’ or ‘ambient chill’: precisely the kind of atmospheric musical genres that AI is really good at generating. If this music had been created by Spotify’s AI, Spotify could have avoided paying royalties to the rights owners, as technically nobody would be the owner of this artificially created music. For the number of streams that the playlists received, the cost would be in the range of $3m. In the end, Spotify declared that the music in the playlists was actually composed by real artists and that they were being paid the corresponding royalties.

AI-generated music is controversial, but also exciting. AI is clever enough to generate short fragments of music in the style of Bach’s chorales. Indeed, despite the expressiveness of these pieces, the techniques used by the German genius to compose them tend to be rather algorithmic. AI is also clever enough to create nice atmospheric music. However, it still has a lot to learn before it can produce a masterpiece in a style of its own, let alone deal with the interpretation of the music. It will be a few years until the rise of a new Leonard Cohen. But AI is on the right track. As Pablo Picasso once put it: “Good artists borrow, great artists steal”… and this is precisely how machine learning works!

Carmen Cabrera Arnau

Carmen is doing a PhD in applied mathematics at UCL with a focus on mathematical modelling of complex systems in urban environments. She enjoys maths outreach and eating cheese naan, and has been working for Chalkdust since 2017.
