My brother and I are both musicians, and we both recently read Alan Rusbridger’s Play It Again, in which he describes how, as a moderately competent amateur pianist, he took on the challenge of learning, and publicly performing, Chopin’s ‘impossible’ Ballade in G minor within a year (it took him nearly a year and a half in the end, his life as editor-in-chief of the Guardian being very much at the mercy of global events). One of the topics he discusses, in connection with learning a piece by heart, is how much ‘information’ music contains.
We can express the question “How much information is contained within a single piece of music?” in another way: “How much information do we need to store to enable a mechanical rendition of a piece of music that the human ear would recognise as a true representation?” Such storage media have existed for well over a century in the form of piano rolls, and in their modern-day equivalent, digital MIDI files. In their simplest forms only pitch, duration and pedalling were recorded, as punched holes in a roll of card. (Intensity of attack and other refinements were added later.)
In considering what a musician needs to remember in order to perform, we can presume these basic components to be adequate. The layer of information needed to build a logical program for mechanical translation into played notes, as well as the layer of human ‘interpretation’ (which amounts to refinements of, and distortions to, the original), can be ignored.
For the sake of this investigation, we assume that the music we intend to store is written over a regular grid of equidistant pitches and over a steady grid of pulses or beats.
So how can we express in a numeric manner what is stored on our imaginary piano roll?
The foundation of music is rhythm:
For this we need two parameters:
a) the length of the rhythmic unit to which the note belongs as expressed in the number of elapsed pulses that the unit would cover.
b) the number of evenly spaced notes that the rhythmic unit contains.
Some examples, assuming the pulse to be a quaver (or eighth note):
2,3 would express a quaver triplet note (three notes in the time of a crotchet)
2,1 would express a crotchet note
1,5 would express a quintuplet demisemiquaver note
6,2 would express a duplet crotchet note over a three crotchet bar
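These rhythm pairs are easy to sketch in code. A minimal illustration (the function name is mine, and the pulse is assumed to be a quaver, as in the examples above):

```python
from fractions import Fraction

def note_duration(unit_pulses, notes_in_unit):
    """Duration of one note, in pulses, from the pair (a, b):
    a = length of the rhythmic unit in elapsed pulses,
    b = number of evenly spaced notes the unit contains."""
    return Fraction(unit_pulses, notes_in_unit)

print(note_duration(2, 1))  # 2   -> a crotchet (two quavers)
print(note_duration(2, 3))  # 2/3 -> a triplet note
print(note_duration(1, 5))  # 1/5 -> a quintuplet note
print(note_duration(6, 2))  # 3   -> a duplet note over a three-crotchet bar
```

Using exact fractions rather than floating point keeps durations like 2/3 of a pulse precise when they are later summed along a string.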
But that’s not enough. We need to know when each note should start. This can be done in two ways:
a) to start immediately after a preceding note in a given musical ‘string’.
b) to start a number of pulses from a sequenced note in a relative string, as we shall see later. (For notes starting in between pulses, and avoiding the use of fractions, a rhythm-defined silence or rest is inserted.)
N.B. New strings can be started at any time and will either refer back to a sequence in another string for start positioning or to ‘zero’, the beginning of the piece.
To which we add a set of numbers describing pitch:
Pitch is most simply expressed in absolute terms: there are 88 possibilities on a standard concert piano. But we can also express pitch relative (in semitones) to the preceding note in a string (a melodic interval, positive or negative), or alternatively relative to a concurrent note in another string (a chordal interval, positive only). This latter method is similar to the ‘figured bass’ system of the baroque era, and for it we need each note to carry an incremental sequence number within its string. The pedal and silence can be represented as special ‘notes’. Bar lines do not need to be represented, as they are only an inaudible visual aid to performers. (A piano roll does not record them.)
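The three ways of expressing pitch can be sketched as follows (a hedged illustration; the function names and the use of piano key numbers 1–88 for absolute pitch are my own choices, not from the text):

```python
def melodic_interval(prev_pitch, pitch):
    """Pitch relative to the preceding note in the same string,
    in semitones; may be positive or negative."""
    return pitch - prev_pitch

def chordal_interval(reference_pitch, pitch):
    """Pitch relative to a concurrent note in another string,
    in semitones; positive only, like a figured-bass figure."""
    if pitch < reference_pitch:
        raise ValueError("chordal intervals are measured upwards only")
    return pitch - reference_pitch

# Absolute pitches as key numbers on an 88-key piano:
middle_c, e_above = 40, 44
print(melodic_interval(middle_c, e_above))  # 4 (a major third up)
print(chordal_interval(middle_c, e_above))  # 4
```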
So now we have, in the ten variables below, all we need to render a whole piece of music out of a succession of linked notes:
String number
String note sequence
Pitch in absolute terms
Pitch relative to the preceding note in the string
Pitch relative to a sequenced note in another string
Relative string number
Relative sequenced note
Rhythmic unit length in pulses
Number of notes within the rhythmic unit
Number of pulses from a relative sequenced note in a string
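The variables above can be gathered into a single record per note, together with the number of the string the note belongs to. A minimal sketch (field names are mine; the optional fields reflect that a note uses only one of the three pitch expressions):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Note:
    string_number: int                       # which musical 'string' this note belongs to
    sequence: int                            # incremental sequence within the string
    pitch_absolute: Optional[int] = None     # e.g. piano key number
    pitch_melodic: Optional[int] = None      # semitones from the preceding note
    pitch_chordal: Optional[int] = None      # semitones above a note in another string
    relative_string: Optional[int] = None    # the other string referenced, if any
    relative_sequence: Optional[int] = None  # the sequenced note in that string
    unit_pulses: int = 1                     # rhythmic unit length in pulses
    notes_in_unit: int = 1                   # number of notes within the unit
    pulse_offset: int = 0                    # pulses from the relative sequenced note

# A crotchet middle C opening string 1 at 'zero', with the pulse as a quaver:
first = Note(string_number=1, sequence=1, pitch_absolute=40, unit_pulses=2)
print(first.unit_pulses / first.notes_in_unit)  # 2.0 pulses per note
```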
That’s a lot of numbers to store or ‘memorise’! But we can do better than this by optimising and compressing.
For example we could add:
An indicator that says ‘repeat’ the note and a number that says how many times.
An indicator that says ‘repeat’ the rhythm and a number that says how many times.
For example, a piece of music that consisted of repeating the same note in the same rhythm one thousand times could be described by a single ‘expression’, the note itself plus the compression extension.
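That thousand-note piece makes the saving easy to see. A sketch (the tuple shape standing in for a full note expression is my own simplification):

```python
def expand(note, repeats):
    """'Repeat' compression: one stored expression plus a count
    stands in for many identical notes."""
    return [note] * repeats

# One expression plus the compression extension replaces a
# thousand stored notes (pitch, unit_pulses, notes_in_unit):
piece = expand((40, 1, 1), 1000)
print(len(piece))  # 1000
```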
We could also set up ‘cells’ encapsulating commonly used pitch and rhythm structures, such as scales, chords and chord sequences, giving them labels so that they can be referenced for re-use. These cells could be joined up sequentially, concurrently or even grouped together (larger on smaller), giving us the possibility of efficiently expressing repeated accompaniment, passage work and sections. In other words, these cells could be ‘stitched’ together, either at their original pitches or relatively transposed.
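A minimal sketch of such a cell library, with stitching at pitch or transposed (the labels and their contents are illustrative, not from the text):

```python
# Cells stored once, as semitone offsets from a base pitch:
cells = {
    "major_scale": [0, 2, 4, 5, 7, 9, 11, 12],
    "major_triad": [0, 4, 7],
}

def stitch(label, base_pitch, transpose=0):
    """Expand a referenced cell at an absolute base pitch,
    optionally transposed by a number of semitones."""
    return [base_pitch + transpose + step for step in cells[label]]

print(stitch("major_triad", 40))               # [40, 44, 47]
print(stitch("major_triad", 40, transpose=7))  # [47, 51, 54]
```

Each re-use of a cell costs only a label, a base pitch and a transposition, however many notes the cell expands to.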
Thinking of a simple canon such as Frère Jacques, it’s not hard to imagine how such a piece could be represented with minimal information.
The list of possible compression options is endless. However, there is a balance to be preserved: the cost of the added referential complexity should not outstrip the savings over a simpler, purely sequential representation.
So, going back to the original question, how much information is contained in a piece of music?
Considering that we might need ten variables of three digits for each note in uncompressed format, this translates into approximately 100 ones and zeroes per note (thirty decimal digits, at roughly 3.3 bits per digit). So the calculation is quite simple: the amount of binary information needed is the number of notes in the piece multiplied by 100. This, however, is a maximum. For pieces that contain a degree of repetition and re-use of common patterns, it can be reduced considerably. True, new ‘stitching’ variables need to be added to each note expression, but this is counterbalanced by the efficiency of once-only library storage of the referenced cells.
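The digit-to-bit arithmetic behind that figure of 100:

```python
import math

variables_per_note = 10
digits_per_variable = 3
decimal_digits = variables_per_note * digits_per_variable  # 30

# Each decimal digit carries log2(10), about 3.32, bits:
bits_per_note = decimal_digits * math.log2(10)
print(round(bits_per_note))  # 100

# Uncompressed maximum for a piece of, say, 10,000 notes:
print(round(10_000 * bits_per_note))  # roughly a million bits
```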
In the end we produce hugely complex constructions that need untangling by a sophisticated program. And though it’s true that the human mind does not work in this way, this approach to musical storage does, to some degree, mirror our ability to memorise simple repetitive and ‘predictable’ music more easily than the more complex and ‘unpredictable’ kind.
You might think of it as an aspect of ‘compression’, but surely what this approach does not capture is the predictability of most music, arising from its genre and its key. The more predictable the music, the less information is needed to capture the options. For example, the notes that end a piece written in the classical genre don’t need to be written down at all: the final note (even chord) is obvious. This also, sometimes, applies to rhythm. An alien without musical training would probably need your complete notation, but most of the rest of us would not.
Also, it’s interesting to consider how much you can remove from a large-scale orchestral piece without detraction, while still meeting your criterion that the rendition should be considered a true representation by a listener. I used to leave out much of the second oboe part of Walton’s Belshazzar’s Feast, not only because I couldn’t play the notes, but partly because I knew no one would notice.
The rest of us (trained and experienced musicians) “would not need complete notation” because we may possess a vast accumulated mental library of rules and conventions that enables us to fill in the notational gaps to a high degree of accuracy. This skill is similar to one possessed by us all to some degree (though particularly well-developed in cleaning ladies): being able to deduce the ends of sentences from what comes before. It’s a similar information process, the pared-down musical notes or textual data being filled out ‘on the fly’ by retrieved contextual and cultural information. In both cases there’s a reliance on a unique sequential information set, the described musical or textual artefact itself, coupled with ‘static’ referential information sets of background ‘matériel’. (These are two of the multiple layers of information involved in reproducing music: contextual, sequential, logical, referential, interpretational.) In fact, not so dissimilar after all to the layers in the ‘compression’ approach as originally described.
As for the notion that ‘more predictability’ means less information, here we are on slippery ground. Though true in a sense (the compression approach to measurement reflects this), where does it lead us? Are we saying that the information volume might describe a piece of music only to a degree of probability? That a piece of music contains only the amount of information necessary to describe itself approximately? Take a couple of examples: the case of the leading note might work 99.9 percent of the time, but that of chords is far hazier. Any piano sight-reader knows that leading notes are easily anticipated, but that chords might be struck approximately, in the right key (so acceptable to the innocent ear), yet arranged in terms of note spacing not exactly as the composer intended. Pitfalls abound: think of a leading note to an interrupted cadence, or the nearly-but-not-quite repeated passage work of the Chopin Ballade. Slippery, because the very art of musical expression could be said to be built upon treading the fine line between predictability and surprise. The unexpected is key.
The notion of probability in music is analytical (reactive), not descriptive (active), so it cannot be part of the answer to the original question of how much information a piece of music contains, or how much needs to be remembered in order to reproduce it. In other words, we could analyse classical music statistically and come up with fascinating contextual probability tables, but in terms of calculating information volume (what is required to describe the piece precisely) these would not be of much help. Taking poetry as a parallel, perhaps closer to music than factual text in its ambiguity: only the full, complete text can truly represent the poet’s intention. An inherently ambiguous construct requires unambiguous description.