PRC Recording Studio
This page has some arithmetic behind music and audio.

Sound pressure, sound level and dB

Decibels

Sound power is measured in decibels. (More accurately, in "bels", named for Alexander Graham Bell.) The number of bels is defined as the common logarithm of the ratio of two powers. Most of the time, we use tenths of bels, called decibels or dB.

  • Smallest detectable change = 1 dB
  • A change of 3 dB is noticeable to most people
  • 10 dB seems twice as loud
       

The Power formula for decibels is
      dB = 10 log10 (P2/P1)

Since most audio engineers work with voltages, not power, and since power is proportional to V squared, we can use V squared instead of P. From the definition of log functions,
          log10 (X^2) = 2 log10 (X)

Thus
[1]  dB = 10 log10 (P2/P1)
= 10 log10 (V2^2/V1^2)
= 20 log10 (V2/V1)
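
As a quick check of equation [1], here is a minimal Python sketch. The function names db_from_power and db_from_voltage are made up for this illustration; they simply apply the two forms of the formula and show that they agree when power is proportional to voltage squared.

```python
import math

def db_from_power(p2, p1):
    """Decibel change between two powers: 10 log10(P2/P1)."""
    return 10 * math.log10(p2 / p1)

def db_from_voltage(v2, v1):
    """Decibel change between two voltages: 20 log10(V2/V1)."""
    return 20 * math.log10(v2 / v1)

# Doubling the voltage quadruples the power (P is proportional to V squared),
# so both forms give the same answer, about 6.02 dB.
print(round(db_from_voltage(2.0, 1.0), 2))   # 6.02
print(round(db_from_power(4.0, 1.0), 2))     # 6.02
# Doubling the power alone is about 3.01 dB.
print(round(db_from_power(2.0, 1.0), 2))     # 3.01
```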

Decibels in voltage

Sound is usually measured with microphones and they respond with voltage (approximately) proportional to the sound pressure, p. From equation [1] above:

      dB = 20 log10 (V2/V1) = 20 log (p2/p1)

When the decibel is used to give the sound level for a single sound rather than a ratio, then a reference level must be chosen. For sound pressure, the reference level (for air) is usually chosen as 20 micropascals, or 0.02 mPa. (This is very low: it is 2 ten-billionths of an atmosphere. Nevertheless, this is about the limit of sensitivity of the human ear, in its most sensitive range of frequency.

Usually this sensitivity is only found in rather young people or in people who have not been exposed to loud music or other loud noises. Personal music systems with in-ear speakers ('walkmans') are capable of very high sound levels in the ear, and are responsible for much of the hearing loss in young adults in developed countries.)

The following example assumes a sound level of 86dB. Actually, that only makes sense relative to some other sound level. That level is often unstated.

So if you read of a sound pressure level of 86 dB, it means that

20 log (p2/p1) = 86 dB

where p1 is the sound pressure of the reference level, and p2 that of the sound in question. Divide both sides by 20:

log (p2/p1) = 4.3

4 is the log of 10 thousand, 0.3 is the log of 2, so this sound has a sound pressure 20 thousand times greater than that of the reference level. 86 dB is a loud but not dangerous level of sound, if it is not maintained for very long.
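
The same arithmetic can be run in reverse with a few lines of Python. This is only a sketch: the constant P_REF and the function names are invented for the illustration, and the 20 micropascal reference is the one quoted above.

```python
P_REF = 20e-6  # reference sound pressure in pascals (20 micropascals)

def pressure_ratio_from_db(level_db):
    """Invert dB = 20 log10(p2/p1) to get the ratio p2/p1."""
    return 10 ** (level_db / 20)

def pressure_from_db(level_db, p_ref=P_REF):
    """Sound pressure in pascals for a level given relative to p_ref."""
    return p_ref * pressure_ratio_from_db(level_db)

print(round(pressure_ratio_from_db(86)))   # 19953, i.e. about 20 thousand
print(round(pressure_from_db(86), 3))      # 0.399 (pascals)
```

The exact ratio is a little under 20 thousand because 0.3 is only approximately the log of 2.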

The signal value recorded in PCM digital audio is simply the instantaneous amplitude of the signal (a linear value, not a decibel value), stored as an integer. In a 16-bit audio format, we can represent a sinusoidally varying audio voltage with 2^16 or 65,536 discrete levels. By convention, the signal is signed. This means the range is roughly ±2^15: there are 32,768 levels going negative and, because zero takes up one code, only 32,767 going positive.

Let's take a trivial example: one signal has twice the amplitude of the other. Let p1 = 1 and p2 = 2; then we have:

20 log (2/1) = 20 log (2)
= 20 * 0.301029996
= 6.020599913

So we say that doubling the amplitude, which is what one more bit of data gives us, is a tad over 6.02 dB. Most people just round this down to 6 dB. Note that this is a doubling of amplitude, not of perceived loudness: remember from above that most people recognize a difference of 10 dB as twice as loud, and can reliably detect a change of 3 dB.
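
The 6 dB per bit rule is easy to check numerically. The sketch below is illustrative only; amplitude_ratio_db is a name invented here, and it applies the 20 log formula from above.

```python
import math

def amplitude_ratio_db(ratio):
    """Level difference in dB for an amplitude (voltage or pressure) ratio."""
    return 20 * math.log10(ratio)

# One extra bit doubles the number of levels.
print(round(amplitude_ratio_db(2), 4))         # 6.0206
# Sixteen bits double the number of levels sixteen times over.
print(round(amplitude_ratio_db(2 ** 16), 2))   # 96.33
```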

A PCM recorder quantizes the amplitude of the signal while it samples at the specified frequency (usually 44.1 kHz). While quantizing, it determines the level to which the voltage for a given sample belongs.

Quantization error is the error that results from trying to represent a continuous analog signal with discrete, stepped digital data. The problem arises when the analog value being sampled falls between two digital "steps." When this happens, the analog value must be represented by the nearest digital value, resulting in a very slight error. In other words, the difference between the continuous analog waveform and the stair-stepped digital representation is the quantization error. For a sine wave, quantization error will appear as extra harmonics in the signal. For music or program material, the signal is constantly changing and quantization error appears as wideband noise, cleverly referred to as "quantization noise." It is extremely difficult to measure or spec quantization noise, since it only exists when a signal is present.

The magnitude of the quantization error can never exceed the voltage represented by half of the least-significant bit (LSB) in the data word. A measurement of the error in a digitization system can be made, and it is expressed as the signal-to-error (S/E) ratio. This ratio is given by Eq. 3, where n is the number of bits in the data word.

[3]       S/E (dB) = 6.02 n + 1.76

Hence, the theoretical S/E ratio for a 16-bit system is 98 dB. Keep in mind that this value is strictly theoretical and will be lowered and raised by many other performance parameters.
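
To see where these figures come from, one can quantize a full-scale sine wave in software and measure the error directly. The following is a rough sketch, not a model of any real converter: the function names, the test frequency and the sample count are arbitrary choices made for this example, and clipping at full scale is ignored.

```python
import math

def quantize(x, bits):
    """Round x (in the range -1..1) to the nearest of 2**bits uniform steps."""
    step = 2.0 / (2 ** bits)          # size of one LSB
    return round(x / step) * step

def measure(bits, n_samples=50_000):
    """Return (signal-to-error ratio in dB, largest single error) for a sine."""
    signal_power = 0.0
    error_power = 0.0
    max_error = 0.0
    for i in range(n_samples):
        # A frequency that is not a whole number of cycles, so the samples
        # land on many different points of the waveform.
        x = math.sin(2 * math.pi * 1234.567 * i / n_samples)
        err = quantize(x, bits) - x
        signal_power += x * x
        error_power += err * err
        max_error = max(max_error, abs(err))
    return 10 * math.log10(signal_power / error_power), max_error

results = {bits: measure(bits) for bits in (8, 12, 16)}
for bits, (se_db, max_error) in results.items():
    half_lsb = 1.0 / (2 ** bits)
    # Columns: bits, measured S/E, Eq. 3 prediction, error within half an LSB?
    print(bits, round(se_db, 1), round(6.02 * bits + 1.76, 1), max_error <= half_lsb)
```

The measured ratios should land close to the Eq. 3 values, and the largest error should never exceed half an LSB.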

Values for commonly used digital audio systems are shown in the next table:

Word length (bits)   Calculated S/E (dB)   Quoted S/E (dB)
16                    98.08                 96
17                   104.10                102
18                   110.12                108
19                   116.14                114
20                   122.16                120
21                   128.18                126
22                   134.20                132
23                   140.22                138
24                   146.24                144
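
The table can be regenerated from Eq. 3 with a short Python sketch. The function names are invented for this example, and treating the "quoted" column as a flat 6 dB per bit is an assumption inferred from the numbers in the table rather than something the sources state.

```python
def theoretical_se_db(bits):
    """Equation [3]: theoretical signal-to-error ratio in dB."""
    return 6.02 * bits + 1.76

def quoted_se_db(bits):
    """Assumed rule of thumb behind the quoted figures: 6 dB per bit."""
    return 6 * bits

for bits in range(16, 25):
    print(f"{bits:2d}  {theoretical_se_db(bits):7.2f}  {quoted_se_db(bits):4d}")
```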

 

For the most part, quantization error manifests itself as noise at high signal levels. However, quantization error becomes quite significant when a low-level signal approaches the level of the LSB. The quantizing error then effectively becomes the signal, and is therefore an audible component of the output. In many types of music, such low-level signals are common, and the distortion caused by quantization error is both unacceptable and irremovable. Fortunately, in practical systems this adverse effect can be effectively eliminated through the use of dither.
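
A small experiment shows what dither does for a signal smaller than one LSB. This is a sketch under stated assumptions: it uses triangular (TPDF) noise spanning plus or minus one LSB, which is one common choice of dither, and all the names, the seed and the signal parameters are arbitrary values picked for the illustration.

```python
import math
import random

def quantize_to_lsb(x, dither=False, rng=random):
    """Round x (measured in LSBs) to a whole number of LSBs.

    With dither=True, triangular noise (the sum of two uniform values)
    is added before rounding. This is an assumed, illustrative dither.
    """
    if dither:
        x += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return round(x)

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 20_000
# A sine wave peaking at only 0.4 LSB: too small to reach a rounding threshold.
signal = [0.4 * math.sin(2 * math.pi * 50 * i / n) for i in range(n)]

plain = [quantize_to_lsb(s) for s in signal]
dithered = [quantize_to_lsb(s, dither=True, rng=rng) for s in signal]

def correlation_with_signal(output):
    """Average of output * signal: zero if the signal was lost entirely."""
    return sum(o * s for o, s in zip(output, signal)) / n

print(set(plain))                                   # {0}: the signal has vanished
print(round(correlation_with_signal(dithered), 3))  # near 0.08, the signal's own power
```

Without dither every sample rounds to zero and the signal is gone. With dither the output is noisy, but on average it still follows the input.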

Further Reading

For more on the physics and signal processing of a Compact Audio disk (and other digital formats) see:


Minimum latency times

In both directions (analog to digital and digital to analog), conversion requires a fixed minimum time that depends simply on the number of samples being processed.

The Benchmarkmedia.com site defines the formula

     delay = 1.01 ms + (48/Fs)

for their DAC-1 product. Other products, and other manufacturers, may have slightly different constants, but the basic formula will stay the same.

This yields an absolute minimum latency of 2.10 ms at 44.1kHz (the standard CD audio rate).
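
The formula is simple enough to evaluate for other sample rates. In this sketch the function name and the default arguments are illustrative; the two constants are the ones from the formula quoted above, and other products would need their own values.

```python
def dac_latency_ms(sample_rate_hz, fixed_ms=1.01, samples=48):
    """Latency in ms: a fixed delay plus a fixed number of sample periods."""
    return fixed_ms + samples / sample_rate_hz * 1000.0

for fs in (44_100, 48_000, 96_000):
    print(fs, f"{dac_latency_ms(fs):.2f} ms")
```

At 44,100 Hz this prints 2.10 ms, matching the figure above. Higher sample rates shorten the second term only, so the latency can never drop below the fixed part.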


Data read rates

Ever wonder why computer CD readers are advertised as 12X or 45X? Ever wonder what "X" is? Here is the answer.

It all started with the CD, aka the Compact Disk - Digital Audio standard, developed by Sony and Philips in 1980. The Red Book, or system description CD-Digital Audio (CD-DA), specifies the CD Digital Audio disc format. This is the formal specification of the flat round plastic disk that replaced vinyl records, cassette tapes and lots more.

The Red Book specifies that stereo music is to be recorded in 16-bit PCM sampled at 44.1 kHz. A little arithmetic shows that the data stream of CD audio must be at least:

Data Rate = "X"
= 16 bits * 2 channels * 44.1kHz
= 16 * 2 * 44100
= 1,411,200 bits/second
= 176,400 bytes/second
= 172 Kbytes/second

So it is easy to see that to make CD audio, you need about 1.4 megabits per second. This is "X": 172 KB per second. This also means that if you want to stream Red Book audio, you need to pass about 1.4 Mb/s. This is much too much for a dial-up modem connection, but well within the capabilities of Ethernet, WiFi, and many broadband Internet connections.

An advertised "45X" CD reader has to deliver 45 times as much data, or 7,938,000 bytes per second. This is a lot of data, more than can be delivered by an old 10 Mb/s Ethernet or a WiFi 802.11b link.
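
The data rate arithmetic above can be reproduced in a few lines of Python. The constant and function names are made up for this sketch, and "X" is taken to be the raw audio rate exactly as computed on this page.

```python
BITS_PER_SAMPLE = 16
CHANNELS = 2
SAMPLE_RATE_HZ = 44_100

def cd_audio_bits_per_second():
    """Raw Red Book audio data rate in bits per second."""
    return BITS_PER_SAMPLE * CHANNELS * SAMPLE_RATE_HZ

def drive_bytes_per_second(speed_x):
    """Data rate of an 'NX' drive, taking X as the audio rate computed above."""
    return speed_x * cd_audio_bits_per_second() // 8

print(cd_audio_bits_per_second())             # 1411200
print(drive_bytes_per_second(1))              # 176400
print(drive_bytes_per_second(1) // 1024)      # 172
print(drive_bytes_per_second(45))             # 7938000
print(drive_bytes_per_second(45) * 8 / 1e6)   # 63.504 (megabits per second)
```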

More importantly, this indicates what level of compression was needed for streaming audio and real-time MP3 file transfers. The fastest dial-up modems claimed 56 kb/s, but that was the best case and was never really delivered in the real world. Also, that figure is in bits per second, not bytes. A dial-up modem could deliver about 50,000 bits/second, or 6.2 KB/s, plus or minus a little for overhead, etc.

We would need nearly 30 to one compression to deliver all the bits of Red Book audio in an MP3 file over such a connection in real time. It is not going to happen. You can get nearly 10 to one if you don't care much about quality.

MP3 file encoding rates have gradually risen in parallel with the improved speeds of broadband connections. In the late 1990s, 128 kbps (kilobits per second) was considered the norm for good quality MP3 files. The standard specification for MP3s allows music to be ripped to bit rates from 8 kbps to 320 kbps. At 320 kbps, the Red Book audio is only compressed a little over 4 to 1, and can sound very good.
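
The compression ratios quoted here follow directly from the Red Book data rate. A minimal sketch, with names invented for the illustration:

```python
CD_AUDIO_BPS = 16 * 2 * 44_100   # Red Book audio, bits per second

def compression_ratio(target_bps, source_bps=CD_AUDIO_BPS):
    """How many times smaller the target stream is than the source."""
    return source_bps / target_bps

print(round(compression_ratio(50_000), 1))    # 28.2 (real-world dial-up modem)
print(round(compression_ratio(128_000), 1))   # 11.0 (128 kbps MP3)
```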

Audio file data does not compress effectively with the general-purpose algorithms (such as LZW and DEFLATE) used by packages such as PKzip and WinZip. Specialized lossless audio algorithms are needed to shrink the files and still recreate the original data exactly, just as PKzip/WinZip can exactly reproduce Word documents or other text files.

There are many lossless compression algorithms and file formats. These include:

  • FLAC - Free Lossless Audio Codec
  • Shorten
  • MLP - Meridian Lossless Packing (used for DVD-audio)
  • WMA Lossless (for Windows)
  • Apple Lossless (ALAC), for Apple Macs
These lossless algorithms can achieve approximately a two to one compression.


 

Copyright © 2004-2005 Farrell and Associates.