Mathematics of the Western music scale

Table of contents

1 Sound waves
2 Harmonics, partials, and overtones
3 Harmony
4 Intervals
5 Just intonation
6 Equal temperament
6.1 Sound samples
7 Other scales
8 External link

Sound waves

Variations in air pressure against the ear drum give rise to the experience we call "sound". Most sound that people recognize as "musical" is dominated by periodic or regular vibrations rather than non-periodic ones, and we refer to the transmission mechanism as a "sound wave". In the simplest case, a "sine wave", the most basic sound waveform, causes the air pressure to increase and decrease in a regular fashion and is heard as a very "pure" tone. Pure tones can be produced by tuning forks. The rate at which the air pressure varies governs the "pitch" of the tone, and is measured in oscillations per second, or hertz (Hz).

A spectrogram. The bright lines along the bottom are the fundamentals of each note, and the other bright lines are (nearly) harmonic overtones.

Whenever two different pitches are played at the same time, their sound waves interact: the highs and lows in the air pressure add together, reinforcing or cancelling each other, to produce a different sound wave. Conversely, any given sound wave can be described as a sum of sine waves of different frequencies. The human hearing apparatus (the ears and brain) can isolate these components: when two or more tones are played at once, a single variation of air pressure at the ear "contains" the pitches of each, and the ear and brain decode it into distinct tones.

When the original sound sources are perfectly periodic, the note consists of several related sine waves (which mathematically add to each other) called the fundamental and the overtones. The lowest frequency present is the fundamental, and is the frequency that the entire wave vibrates at. The overtones vibrate faster than the fundamental, but must vibrate at integer multiples of the fundamental frequency in order for the total wave to be exactly the same each cycle. Real instruments are close to periodic, but the frequencies of the overtones are slightly imperfect, so the shape of the wave changes slightly over time.
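The periodicity argument above can be checked numerically. The following is a minimal sketch, not a model of any real instrument: the 220 Hz fundamental, the amplitudes, and the function name `partial_sum` are illustrative assumptions. It adds a few sine waves and compares the result one fundamental period apart.

```python
import math

def partial_sum(t, partials):
    """Add sine waves; `partials` is a list of (frequency_hz, amplitude) pairs."""
    return sum(amp * math.sin(2 * math.pi * freq * t) for freq, amp in partials)

fundamental_hz = 220.0               # assumed example pitch
period = 1.0 / fundamental_hz

# Overtones at exact integer multiples of the fundamental (amplitudes are made up).
harmonic = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)]
# Same note with the second partial slightly sharp, as on a real instrument.
inharmonic = [(220.0, 1.0), (442.0, 0.5), (660.0, 0.25)]

t = 0.0012  # an arbitrary instant, in seconds
harmonic_drift = abs(partial_sum(t + period, harmonic) - partial_sum(t, harmonic))
inharmonic_drift = abs(partial_sum(t + period, inharmonic) - partial_sum(t, inharmonic))

print(harmonic_drift < 1e-9)     # the wave repeats exactly each cycle
print(inharmonic_drift > 1e-3)   # the wave shape has changed after one cycle
```

With exact integer multiples the wave returns to the same value after each cycle; with the slightly sharp partial it does not, which is the slow change of shape described above.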

Harmonics, partials, and overtones

The fundamental is the frequency at which the entire wave vibrates. Overtones are other sinusoidal components present at frequencies above the fundamental. All of the frequency components that make up the total waveform, including the fundamental and the overtones, are called partials.

Overtones which are perfect integer multiples of the fundamental are called harmonics.

When an overtone is close to harmonic but not exact, it is sometimes called a harmonic partial, although such overtones are often referred to simply as harmonics.

Sometimes overtones are created that are not anywhere near a harmonic, and are just called partials or inharmonic overtones.

The fundamental frequency is considered the first harmonic and the first partial. The numbering of the partials and harmonics is then usually the same; the second partial is the second harmonic, etc. But if there are inharmonic partials, the numbering no longer coincides.

Overtones are numbered as they appear above the fundamental. So strictly speaking, the first overtone is the second partial (and usually the second harmonic).

As this can result in confusion, only harmonics are usually referred to by their numbers, and overtones and partials are described by their relationships to those harmonics.

Harmony

200 and 300 Hz waves and their sum, showing the periods of each

A spectrogram of a violin playing a note and then a perfect fifth above it. The shared partials are highlighted by the white dashes.

If two notes are simultaneously played, with frequency ratios that are simple fractions (e.g. 2/1, 3/2 or 5/4), then the composite wave will still be periodic with a short period, and the combination will sound consonant. For instance, a note vibrating at 200 Hz and a note vibrating at 300 Hz (a perfect fifth, or 3/2 ratio, above 200 Hz) will add together to make a wave that repeats at 100 Hz: every 1/100 of a second, the 300 Hz wave will repeat thrice and the 200 Hz wave will repeat twice.
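For whole-number frequencies, the repetition rate of the combined wave is the greatest common divisor of the two frequencies. A small sketch of the 200 Hz and 300 Hz example (the function name is an illustrative choice, and integer frequencies in Hz are assumed):

```python
from math import gcd

def composite_frequency(f1_hz, f2_hz):
    """Repetition rate (Hz) of the sum of two waves with integer frequencies."""
    return gcd(f1_hz, f2_hz)

rate = composite_frequency(200, 300)
# Repetition rate, then how many cycles each note completes per repetition.
print(rate, 300 // rate, 200 // rate)   # 100 3 2
```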

Additionally, the two notes will have many of the same partials. For instance, a note with a fundamental frequency of 200 Hz will have harmonics at

(200,) 400, 600, 800, 1000, 1200, ...

A note with fundamental frequency of 300 Hz will have harmonics at

(300,) 600, 900, 1200, 1500, …

The two notes have the harmonics 600 and 1200 in common, and more will coincide further up the series.
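The shared harmonics can be listed directly. This sketch assumes ideal, exactly harmonic notes and an arbitrary cut-off on how many harmonics to generate:

```python
def harmonics(fundamental_hz, count):
    """First `count` harmonics of a note, starting with the fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]

# Both series are generated up to 2400 Hz (an arbitrary stopping point).
shared = sorted(set(harmonics(200, 12)) & set(harmonics(300, 8)))
print(shared)   # [600, 1200, 1800, 2400]
```

The coincidences fall at every multiple of 600 Hz, the lowest frequency that is a harmonic of both notes.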

The combination of a composite wave with a short period and shared or closely related partials is what causes the sensation of harmony.

When two frequencies are close to a simple ratio, but not exact, the composite wave cycles slowly enough that the alternating reinforcement and cancellation of the waves is heard as a steady pulsing instead of a tone. This is called beating, and is considered to be unpleasant, or dissonant.
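One common way to estimate the pulsing rate, assumed here rather than taken from the text, is to look at the pair of harmonics that would coincide if the ratio were exact and take the difference between them. The function name and the 301 Hz mistuned example are illustrative:

```python
def beat_rate(f_low_hz, f_high_hz, p, q):
    """Beat rate (Hz) when f_high / f_low is close to the simple ratio p / q.

    The p-th harmonic of the lower note and the q-th harmonic of the upper
    note nearly coincide; their difference is heard as pulsing.
    """
    return abs(p * f_low_hz - q * f_high_hz)

print(beat_rate(200, 300, 3, 2))   # 0: an exact 3/2 fifth, no beating
print(beat_rate(200, 301, 3, 2))   # 2: 600 Hz against 602 Hz pulses twice a second
```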

Intervals

Musicians call the trivial case of a 1:1 ratio a "unison." More interesting is the 2:1 ratio. Any two pitches with a 2:1 ratio between them define a frequency ratio (or "interval") that is called an "octave". This is the smallest interval at which two different pitches will be perceived by the listener as being "the same note", because every harmonic of the higher note coincides with a harmonic of the lower one, so the higher note adds no new overtones to the composite waveform. The average human ear can perceive tones from about 20 Hz at the low end to around 20,000 Hz at the high end (though the upper limit falls to less than 10,000 Hz with age). Starting at 20 Hz and doubling up to 20,000 Hz shows that the human ear has a range of about ten octaves.
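Since each octave doubles the frequency, the number of octaves between two pitches is the base-2 logarithm of their ratio. A quick check of the "about ten octaves" figure (the function name is an illustrative choice):

```python
import math

def octaves_between(low_hz, high_hz):
    """Number of doublings needed to get from low_hz to high_hz."""
    return math.log2(high_hz / low_hz)

print(round(octaves_between(20, 20000), 2))   # 9.97
```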

There are clearly many other ratios of small integers, and even though they do not all avoid the generation of additional overtones, as does the octave, the hearing apparatus perceives any two notes with such a ratio (or close to it) to be "in tune", or in harmony.

Just intonation

A scale can then be defined as a set of notes and the corresponding intervals between each of those notes and the lowest one. The distances and number of notes vary, but in the majority of the western classical and popular tradition, twelve notes span a single octave, and the set of notes and ratios is copied onto all the other octaves.

Below is a twelve tone scale in just intonation:

Semitones  Ratio   Interval
0          1:1     unison
1          21:20   [semitone]
1          16:15   [semitone] / minor second
2          10:9    major second
2          9:8     major second
3          6:5     minor third
4          5:4     major third
5          4:3     perfect fourth
6          7:5     tritone / augmented fourth / diminished fifth
7          3:2     perfect fifth
8          8:5     minor sixth
9          5:3     major sixth
10         9:5     minor seventh
11         17:9    major seventh
12         2:1     octave

(In theory unisons and octaves and their multiples are also "perfect" but this terminology is rarely used.)

To obtain a scale of 12 notes, the major tone 9:8 is equated with the minor tone 10:9 and with two semitones (16:15 taken twice, i.e. 256:225).

For purposes of tuning we need a reference pitch, something all the instruments can agree on. Usually a 440 Hz tone is used as the reference pitch, as an A natural. Now, according to our table above, we can calculate the pitch of any other note by setting up a simple ratio relationship. For example, if I wanted to calculate the pitch a perfect fifth above A440, I would write:

   (X / 440) = (3 / 2)

and solve for X. Simple algebra, right? In the above, X comes to 660. Let's calculate two more:

   (X / 440) = (9 / 8); major second = 495
   (X / 440) = (5 / 4); major third = 550

The note that a scale centers around is called the tonic. We often use the term "key" for a scale, so the key of A is just a scale with A as the tonic.

Now, what actual pitches do we end up with? If we pick A natural (440Hz) as the tonic, we have a scale containing the following frequencies/pitches:

Pitch (Hz)  Note
440.000     A
462.000     A#
495.000     B
528.000     C
550.000     C#
586.667     D
616.000     D#
660.000     E
704.000     F
733.333     F#
792.000     G
831.111     G#
880.000     A
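The frequencies above follow from multiplying the tonic by each ratio in the interval table. A minimal sketch, using the 21:20 semitone and 9:8 major second (the two variants that match the listed pitches); the names `JUST_RATIOS` and `just_scale` are illustrative:

```python
from fractions import Fraction

# Ratios taken from the just intonation table, paired with note names for a scale on A.
JUST_RATIOS = [
    ("A", Fraction(1, 1)), ("A#", Fraction(21, 20)), ("B", Fraction(9, 8)),
    ("C", Fraction(6, 5)), ("C#", Fraction(5, 4)), ("D", Fraction(4, 3)),
    ("D#", Fraction(7, 5)), ("E", Fraction(3, 2)), ("F", Fraction(8, 5)),
    ("F#", Fraction(5, 3)), ("G", Fraction(9, 5)), ("G#", Fraction(17, 9)),
    ("A", Fraction(2, 1)),
]

def just_scale(tonic_hz):
    """Return (note name, frequency in Hz) pairs for a just scale on tonic_hz."""
    return [(name, float(tonic_hz * ratio)) for name, ratio in JUST_RATIOS]

for name, hz in just_scale(440):
    print(f"{hz:8.3f} {name}")
```

Exact fractions are used so that rounding only happens once, when each pitch is converted to a floating-point number.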

Any scale in which the ratio of every note to the tonic is a ratio of whole numbers (in practice, small ones) is called a scale of just intonation. These scales have a very natural-sounding quality to them.

This is the common western scale of just intonation; other scales of just intonation exist, such as Indian raga scales.

Equal temperament

The problem with just intonation is that it is very difficult to achieve in any stopped or fretted instrument. The difficulty is subtle, but it means big headaches. For example, the interval of a major second is the "whole step" so common in the western tradition. It defines the distance between A and B, or C and D, among others. The interval of a major third defines the distance between two notes with two "whole steps" between them; for example, the distance between C and E or F and A.

If this is true, then the major second of a major second (that is, two whole steps from a given note) should equal the major third (two whole steps from a given note), or:

   (X / 495) = (9 / 8)

X should equal 550. But instead, X is 556.875. What has happened here? Well, A(440) was the initial tonic of the scale, and all the intervals we defined above meet the integer ratio condition. But then we took a different note (495) as the tonic, and computed the major second of that. So in effect, we used two different scales and found that after a whole step from the tonic in each case, we end up with a pitch that isn't in the other scale.

This is the problem. Any given scale of just intonation must be tuned to a tonic, which is fine if you only want to play in one scale or "key". However, you have to retune the instrument every time you modulate keys. As many classical composers (and pop ballad writers) will tell you, this has a way of limiting your expressive power.
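The mismatch between two whole steps and a major third can be worked out with exact fractions. This is a sketch of the arithmetic above; the variable names are illustrative:

```python
from fractions import Fraction

major_second = Fraction(9, 8)
major_third = Fraction(5, 4)

# Two whole steps stacked: a major second above a major second.
two_whole_steps = major_second * major_second

print(float(440 * two_whole_steps))    # 556.875
print(float(440 * major_third))        # 550.0
print(two_whole_steps / major_third)   # 81/80
```

The two routes to the "same" note differ by a ratio of 81/80, about 1.25%, no matter which tonic is chosen.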

So what's a keyboard manufacturer to do? The answer is simple: make one note in tune, and space all the other notes equally (logarithmically equally, anyway). This is what happens on most fretted instruments and keyboard instruments. Now, instead of calculating pitch with integer ratios, we just plug an interval into the following equation:

   P = 440 * 2^{n / 12}

where n is the number of half steps sharp you want to go (and hey, guess what, negative numbers work as expected; n = -3 finds the pitch three half steps, that is a minor third, below A440). We call this approximation a scale of even (or equal) temperament, since the distance to any other note is independent of (and consistent across) key centers. The use of this scale was NOT pioneered by Bach. Equal temperament throws everything very slightly out of tune relative to just intonation. Observe:

Note  Just pitch (Hz)  E.T. pitch (Hz, approx.)  Error (%)
A     440.000          440.00                     0.0
Bb    462.000          466.16                    +0.9
B     495.000          493.88                    -0.2
C     528.000          523.25                    -0.9
C#    550.000          554.37                    +0.8
D     586.667          587.33                    +0.1
D#    616.000          622.25                    +1.0
E     660.000          659.26                    -0.1
F     704.000          698.46                    -0.8
F#    733.333          739.99                    +0.9
G     792.000          783.99                    -1.0
Ab    831.111          830.61                    -0.1

As you can see, we're never more than about 1% out of pitch, which many listeners will not notice. However, you *are* out of tune inherently, so when your instrument then goes further out of tune you sound *really* bad. The advantage, though, is that fixed-pitch instruments such as keyboards and fretted instruments can play in every key without retuning, which makes composition and playing much easier.
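The comparison can be recomputed from the two tuning rules. In this sketch the error is taken as the equal-tempered pitch relative to the just pitch, which is an assumption about how the percentages were derived; the names `JUST_HZ`, `equal_tempered_hz` and `percent_error` are illustrative:

```python
# Just-intonation pitches (Hz) on A = 440, keyed by half steps above A.
JUST_HZ = {
    0: ("A", 440.0), 1: ("Bb", 462.0), 2: ("B", 495.0), 3: ("C", 528.0),
    4: ("C#", 550.0), 5: ("D", 440.0 * 4 / 3), 6: ("D#", 616.0),
    7: ("E", 660.0), 8: ("F", 704.0), 9: ("F#", 440.0 * 5 / 3),
    10: ("G", 792.0), 11: ("Ab", 440.0 * 17 / 9),
}

def equal_tempered_hz(half_steps, reference_hz=440.0):
    """Pitch a given number of half steps above (or below) the reference."""
    return reference_hz * 2 ** (half_steps / 12)

def percent_error(just_hz, tempered_hz):
    """Deviation of the tempered pitch from the just pitch, in percent."""
    return 100.0 * (tempered_hz - just_hz) / just_hz

errors = {}
for n, (name, just_hz) in JUST_HZ.items():
    tempered = equal_tempered_hz(n)
    errors[name] = percent_error(just_hz, tempered)
    print(f"{name:3} {just_hz:8.3f} {tempered:8.2f} {errors[name]:+5.1f}")
```

Computed this way, the largest deviations (D# and G) come out marginally above 1% in magnitude.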

In this system the ratio of the perfect fifth is about 1.4983 instead of 1.5, and the half-step ratio is about 1.059463 instead of 1.05. Only the octave is still exactly 2:1. Tuning the "well tempered clavier" was not easy at first, since the tuner had to interpolate between the tunings suggested by different keys. Tuning done by ear cannot match the twelfth root of two to six or seven digits.
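These ratios are powers of the twelfth root of two and can be checked in a few lines. The function name is an illustrative choice:

```python
def equal_tempered_ratio(half_steps):
    """Frequency ratio spanned by a number of equal-tempered half steps."""
    return 2 ** (half_steps / 12)

print(round(equal_tempered_ratio(7), 4))          # 1.4983 (perfect fifth)
print(round(equal_tempered_ratio(1), 6))          # 1.059463 (half step)
print(equal_tempered_ratio(12))                   # 2.0 (octave)
print(round(440 * equal_tempered_ratio(-3), 2))   # 369.99 (three half steps below A440)
```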

Many classical composers wrote compositions for just-intonated instruments (wind instruments in particular). However, since these instruments couldn't re-tune to a new tonic, modulating the key of the piece created a tension; it sounded like you were still playing in the original key and wanted to return to it. Some people insist that playing the piece on a JI instrument is the only way to truly hear what the composer intended. Other music fans disagree.

Sound samples

If you have a player capable of reading Vorbis files (for example Winamp 3), you can listen to the following samples demonstrating the difference between just intonation and equal temperament. You may need to play the samples several times before you can pick the difference.

- this sample has half a second at 550 Hz (C# in the just intonation scale), followed by half a second at 554.37 Hz (C# in the equal temperament scale).
- this sample consists of a "dyad". The lower note is a constant A (440 Hz in either scale), the upper note is a C# in the just intonation scale for the first 1 s, and a C# in the E.T. scale for the second 1 s. Phase differences make it easier to pick the transition than in the previous sample.
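Comparable test tones can be synthesized with the Python standard library. This is a sketch under stated assumptions, not the method used to make the original samples: it writes WAV rather than Vorbis, uses pure sine waves, and the sample rate, amplitude, and function names are arbitrary choices. The tones are simply joined end to end, so a small click at the transition is possible.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed)

def tone(freqs_hz, seconds, amplitude=0.3):
    """Samples for one or more sine waves sounding together."""
    count = int(SAMPLE_RATE * seconds)
    return [
        amplitude
        * sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f in freqs_hz)
        / len(freqs_hz)
        for i in range(count)
    ]

def write_wav(target, samples):
    """Write mono 16-bit PCM audio to a file path or binary file object."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(target, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(frames)

# Just C# followed by equal-tempered C#, half a second each.
single_notes = tone([550.0], 0.5) + tone([554.37], 0.5)
# A with just C#, then A with equal-tempered C#, one second each.
dyads = tone([440.0, 550.0], 1.0) + tone([440.0, 554.37], 1.0)

# Example use (hypothetical file names):
# write_wav("c_sharp_compare.wav", single_notes)
# write_wav("dyad_compare.wav", dyads)
```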

Other scales

An alternative to having 12 notes logarithmically evenly spaced is to make the semitone slightly more than half a tone. This practice is known as meantone temperament. If the semitone is made 0.6 of a tone (logarithmically), you get a scale of 31 notes, which approximates just intonation better than the 12-note scale does.
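The count of 31 can be reproduced with a short calculation. The sketch assumes that an octave spans five whole tones and two semitones, as in the major scale, and measures the octave in the largest step that fits evenly into both the tone and the semitone; the function name is illustrative:

```python
from fractions import Fraction

def steps_per_octave(semitone_in_tones):
    """Equal steps per octave when a semitone is the given fraction of a tone.

    Assumes octave = 5 tones + 2 semitones. Pass an exact Fraction such as
    Fraction(3, 5), not the float 0.6.
    """
    semitone = Fraction(semitone_in_tones)
    # One step is 1/denominator of a tone, so a tone is `denominator` steps
    # and a semitone is `numerator` steps.
    return 5 * semitone.denominator + 2 * semitone.numerator

print(steps_per_octave(Fraction(1, 2)))   # 12
print(steps_per_octave(Fraction(3, 5)))   # 31

# Major third (two whole tones) in each system, compared with the just 5:4.
third_12 = 2 ** (4 / 12)
third_31 = 2 ** (10 / 31)
print(abs(third_31 - 1.25) < abs(third_12 - 1.25))   # True
```

The comparison shown covers only the major third; other intervals would need to be checked separately.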

External link

Tuning for Beginners

This article (or a previous version thereof) was based on http://napalm.firest0rm.org/issue4.txt, used by permission of the original author (ajax).

See also: Joseph Schillinger