The power formula for decibels is
$$\mathrm{dB} = 10 \log_{10}(P_2/P_1)$$
Since most audio engineers work with voltages, not power, and since
power is proportional to voltage squared, we can use $V^2$ in place
of $P$. From the properties of logarithms,
$$\log_{10}(X^2) = 2 \log_{10}(X)$$
Thus
[1]
$$\begin{aligned}
\mathrm{dB} &= 10 \log_{10}(P_2/P_1) \\
&= 10 \log_{10}(V_2^2/V_1^2) \\
&= 20 \log_{10}(V_2/V_1)
\end{aligned}$$
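As a quick check of the derivation, the short sketch below computes the same level from a voltage ratio and from the corresponding power ratio. The function names, the example voltages, and the 8 ohm load are illustrative assumptions, not values from the text; the only requirement is that both voltages are measured across the same load.

```python
import math

def power_ratio_to_db(p2: float, p1: float) -> float:
    """Decibels from a power ratio: 10 * log10(P2 / P1)."""
    return 10.0 * math.log10(p2 / p1)

def voltage_ratio_to_db(v2: float, v1: float) -> float:
    """Decibels from a voltage ratio: 20 * log10(V2 / V1)."""
    return 20.0 * math.log10(v2 / v1)

if __name__ == "__main__":
    v1, v2 = 1.0, 3.0      # arbitrary example voltages
    load_ohms = 8.0        # hypothetical load, identical for both signals
    p1 = v1 ** 2 / load_ohms
    p2 = v2 ** 2 / load_ohms
    # Both lines print the same level, because P is proportional to V squared.
    print(voltage_ratio_to_db(v2, v1))
    print(power_ratio_to_db(p2, p1))
```

The load value cancels out of the ratio, which is why the voltage form can be used without knowing the impedance, as long as it stays the same.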
Decibels in voltage
Sound is usually measured with microphones and they respond with
voltage (approximately) proportional to the sound pressure, p. From
equation [1] above:
$$\mathrm{dB} = 20 \log_{10}(V_2/V_1) = 20 \log_{10}(p_2/p_1)$$
When the decibel is used to give the sound level for a single sound
rather than a ratio, a reference level must be chosen. For sound
pressure, the reference level (for air) is usually chosen as 20 micropascals,
or 0.02 mPa. (This is very low: it is 2 ten-billionths of an atmosphere.
Nevertheless, this is about the limit of sensitivity of the human
ear in its most sensitive range of frequency.)
The following example assumes a sound level of 86 dB. Strictly,
that only makes sense relative to some other sound level, and that
reference level is often left unstated.
So if you read of a sound pressure level of 86 dB, it means that
$$20 \log_{10}(p_2/p_1) = 86\ \mathrm{dB}$$
where $p_1$ is the sound pressure of the reference level,
and $p_2$ that of the sound in question. Divide both sides
by 20:
$$\log_{10}(p_2/p_1) = 4.3$$
4 is the log of 10 thousand, 0.3 is the log of 2, so this sound
has a sound pressure 20 thousand times greater than that of the
reference level. 86 dB is a loud but not dangerous level of sound,
if it is not maintained for very long.
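The same calculation can be run in reverse for any level. The sketch below is illustrative only: the function names are assumptions, and the 20 micropascal reference is the usual value for air quoted earlier. Note that the exact ratio for 86 dB comes out slightly under 20 thousand, because 0.3 is only approximately the log of 2.

```python
P_REF_PA = 20e-6  # reference sound pressure in air: 20 micropascals

def db_to_pressure_ratio(level_db: float) -> float:
    """Invert dB = 20 * log10(p2 / p1) to recover the ratio p2 / p1."""
    return 10.0 ** (level_db / 20.0)

def spl_to_pascals(level_db: float, p_ref: float = P_REF_PA) -> float:
    """Absolute sound pressure for a level given relative to p_ref."""
    return p_ref * db_to_pressure_ratio(level_db)

if __name__ == "__main__":
    print(db_to_pressure_ratio(86.0))  # a little under 20,000
    print(spl_to_pascals(86.0))        # roughly 0.4 Pa
```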
The signal value recorded in PCM digital audio is simply the instantaneous
amplitude (voltage) of the signal, stored as an integer. In a 16-bit audio format,
we can represent a sinusoidally varying voltage audio signal by
$2^{16}$ or 65,536 discrete levels. By convention, the signal
is signed. This means a range of $\pm 2^{15}$, so there are 32,768
levels in each direction (well, only 32,767 going positive, since zero takes one code).
Let's take a trivial example: one sound has twice the sound pressure of the
other. Let's make $p_1 = 1$ and $p_2 = 2$; then we have:
$$\begin{aligned}
20 \log_{10}(2/1) &= 20 \log_{10}(2) \\
&= 20 \times 0.301029996 \\
&= 6.020599913
\end{aligned}$$
So, we say that twice the amplitude, or one more bit of data, is a tad
over 6.02 dB. Most people just round this down to 6 dB. Remember
from above, most people recognize a sound of 10 dB difference as
twice as loud, and can reliably detect a change of 3 dB.
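Because each extra bit doubles the number of levels, the 6.02 dB figure can be extended to a whole word. The following is a minimal sketch with assumed function names; it treats the ratio between the full range of levels and a single step as an amplitude ratio.

```python
import math

def amplitude_ratio_to_db(ratio: float) -> float:
    """Decibels for an amplitude (voltage or pressure) ratio."""
    return 20.0 * math.log10(ratio)

def word_length_range_db(bits: int) -> float:
    """Ratio of 2**bits levels to one level, expressed in dB."""
    return amplitude_ratio_to_db(2 ** bits)

if __name__ == "__main__":
    print(amplitude_ratio_to_db(2))   # one bit: a tad over 6.02 dB
    print(word_length_range_db(16))   # 16 bits: a little over 96 dB
```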
A PCM recorder quantizes the amplitude of the signal while it
samples at the specified frequency (usually 44.1 kHz). While quantizing,
it determines the level to which the voltage for a given sample
belongs. Quantization error results from trying to represent a continuous analog
signal with discrete, stepped digital data. The problem arises when
the analog value being sampled falls between two digital "steps."
When this happens, the analog value must be represented by the nearest
digital value, resulting in a very slight error. In other words,
the difference between the continuous analog waveform
and the stair-stepped
digital representation is quantization error. For a sine wave, quantization
error will appear as extra harmonics in the signal. For music or
program material, the signal is constantly changing and quantization
error appears as wideband noise, cleverly referred to as "quantization
noise." It is extremely difficult to measure or spec quantization
noise, since it only exists when a signal is present.
The magnitude of the quantization error can never exceed the
voltage represented by half of the least-significant bit (LSB)
in the data word. A measurement of the error in a digitization system
can be made, and it is expressed as the signal-to-error (S/E) ratio.
This ratio is given by Eq. 3, where n is the number of bits
in the data word.
[3]
$$\mathrm{S/E\ (dB)} = 6.02\,n + 1.76$$
Hence, the theoretical S/E ratio for a 16-bit system is 98 dB.
Keep in mind that this value is strictly theoretical and will be
lowered or raised by many other performance parameters.
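One way to get a feel for Eq. 3 is to quantize a full-scale sine wave in software and measure the ratio directly. The sketch below is a rough simulation, not a model of any real converter: the function names, the sample count, and the phase step are arbitrary assumptions, and rounding to the nearest integer stands in for an ideal quantizer with a step of one LSB.

```python
import math

def theoretical_se_db(bits: int) -> float:
    """Eq. 3: theoretical signal-to-error ratio for an n-bit word."""
    return 6.02 * bits + 1.76

def measured_se_db(bits: int, num_samples: int = 20000):
    """Quantize a full-scale sine; return (S/E in dB, largest |error| in LSBs)."""
    amplitude = 2 ** (bits - 1) - 1   # largest positive code, in LSBs
    phase_step = 0.1234567            # radians per sample, arbitrary choice
    signal_power = 0.0
    error_power = 0.0
    max_error = 0.0
    for i in range(num_samples):
        x = amplitude * math.sin(i * phase_step)  # "analog" value
        q = round(x)                              # nearest digital step
        e = q - x                                 # quantization error
        signal_power += x * x
        error_power += e * e
        max_error = max(max_error, abs(e))
    return 10.0 * math.log10(signal_power / error_power), max_error

if __name__ == "__main__":
    for bits in (8, 12, 16):
        se_db, worst = measured_se_db(bits)
        print(bits, round(se_db, 2), theoretical_se_db(bits), worst)
```

The measured figure should land close to the formula, and the largest error should stay at or below half an LSB, as stated above. A real converter adds its own noise and nonlinearity on top of this ideal case.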
Values for commonly used digital audio systems are shown in the
next table:
| Word length (bits) | Calculated signal-to-error ratio (dB) | Quoted signal-to-error ratio (dB) |
|---|---|---|
| 16 | 98.08 | 96 |
| 17 | 104.10 | 102 |
| 18 | 110.12 | 108 |
| 19 | 116.14 | 114 |
| 20 | 122.16 | 120 |
| 21 | 128.18 | 126 |
| 22 | 134.20 | 132 |
| 23 | 140.22 | 138 |
| 24 | 146.24 | 144 |
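The table can be regenerated from Eq. 3 for any word length. In the sketch below, the function names are assumptions, and the "quoted" column is modelled as a flat 6 dB per bit, which is simply the pattern the quoted values in the table follow.

```python
def calculated_se_db(bits: int) -> float:
    """Calculated column: 6.02 * n + 1.76."""
    return 6.02 * bits + 1.76

def quoted_se_db(bits: int) -> int:
    """Quoted column, assuming the common 6 dB per bit rule of thumb."""
    return 6 * bits

if __name__ == "__main__":
    for bits in range(16, 25):
        print(f"{bits:2d}  {calculated_se_db(bits):7.2f}  {quoted_se_db(bits):4d}")
```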
For the most part, quantization error manifests itself as noise
at high signal levels. However, quantization error becomes quite
significant when a low-level signal approaches the level of the
LSB: the quantizing error then effectively becomes the signal
and is therefore an audible component of the output. In many types
of music, such low-level signals are common, and distortion caused
by quantization error is both unacceptable and irremovable. Fortunately,
in practical systems this adverse effect can be effectively eliminated
through the use of dither.
Minimum latency times
Conversion in either direction, analog to digital or digital to analog,
requires a fixed minimum time that depends simply on the sample rate.
The
Benchmarkmedia.com
site gives the formula
$$\text{delay} = 1.01\ \text{ms} + (48/F_s)$$
for their DAC-1 product, where $F_s$ is the sample rate. Other products and
other manufacturers may have slightly different constants, but the basic formula will
stay the same.
This yields a minimum latency of 2.10 ms at 44.1 kHz (the
standard CD audio rate).
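The latency formula is easy to evaluate for other sample rates. This is a minimal sketch: the function and parameter names are assumptions, and the default constants are the ones quoted above for the DAC-1, so other converters would need their own values.

```python
def conversion_latency_ms(sample_rate_hz: float,
                          fixed_ms: float = 1.01,
                          delay_samples: float = 48.0) -> float:
    """delay = fixed part in ms + a fixed number of sample periods."""
    return fixed_ms + 1000.0 * delay_samples / sample_rate_hz

if __name__ == "__main__":
    for rate_hz in (44_100, 48_000, 96_000):
        print(rate_hz, round(conversion_latency_ms(rate_hz), 2))
```

The sample-dependent part shrinks as the sample rate rises, while the fixed 1.01 ms part does not.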
Data read rates
Ever wonder why computer CD readers are advertised as 12X or 45X? Ever wonder what "X" is? Here is the answer.
It all started with the CD, aka the Compact Disc Digital Audio standard,
developed by
Sony and Philips
in 1980.
The Red Book, or system description CD-Digital Audio (CD-DA),
specifies the CD Digital Audio disc format.
This is the formal specification of the flat round plastic disc that
replaced vinyl records, cassette tapes and lots more.
The Red Book specifies that stereo music is to be recorded
in 16-bit PCM sampled at 44.1 kHz. A little arithmetic shows that
the data stream of CD audio must be at least:
$$\begin{aligned}
\text{Data rate} = \text{"X"} &= 16\ \text{bits} \times 2\ \text{channels} \times 44.1\ \text{kHz} \\
&= 16 \times 2 \times 44100 \\
&= 1{,}411{,}200\ \text{bits/second} \\
&= 176{,}400\ \text{bytes/second} \\
&\approx 172\ \text{Kbytes/second}
\end{aligned}$$
So it is easy to see that to make CD audio,
you need about 1.4 megabits per second. This is "X": 172 KB per second.
This also means that if you want to stream Red Book audio,
you need to pass about 1.4 Mb/s.
This is much too much for a dial-up modem connection,
but well within the capabilities of Ethernet, WiFi, and many
broadband Internet connections.
An advertised "45X" CD reader has to deliver 45 times as
much data, or 7,938,000 bytes per second.
This is a lot of data, more than can be delivered by an old 10 Mb/s
Ethernet or a WiFi 802.11b link.
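The data-rate arithmetic can be reproduced for any drive speed. The sketch below uses assumed constant and function names; the 16-bit, 2-channel, 44.1 kHz figures are the Red Book values given above.

```python
BITS_PER_SAMPLE = 16
CHANNELS = 2
SAMPLE_RATE_HZ = 44_100

def cd_audio_bits_per_second() -> int:
    """The 1X rate of Red Book audio, in bits per second."""
    return BITS_PER_SAMPLE * CHANNELS * SAMPLE_RATE_HZ

def drive_bytes_per_second(speed_multiple: int) -> int:
    """Bytes per second a drive rated at speed_multiple 'X' must deliver."""
    return speed_multiple * cd_audio_bits_per_second() // 8

if __name__ == "__main__":
    print(cd_audio_bits_per_second())         # 1X in bits/second
    print(drive_bytes_per_second(1))          # 1X in bytes/second
    print(drive_bytes_per_second(1) / 1024)   # 1X in Kbytes/second
    print(drive_bytes_per_second(45))         # a "45X" reader
    print(drive_bytes_per_second(45) * 8 / 1e6)  # the same in Mb/s
```

Expressing the 45X rate in megabits per second makes the comparison with a 10 Mb/s Ethernet link direct.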
More importantly, this indicates what level of compression was needed
for streaming audio and real-time MP3 file transfers. The fastest
dial-up modems claimed 56 kb/s, but that was best case
and never really delivered in the real world. Note also that this figure is in bits per second, not bytes.
A dial-up modem could deliver about 50,000 bits/second, or 6.2 KB/s,
plus or minus a little for overhead, etc.
We would need nearly 30 to 1 compression to deliver all the
bits of Red Book audio in an MP3 file over such a link. It is not going to happen.
You can get nearly 10 to 1 if you don't care much about quality.
MP3 file encoding rates have gradually risen in parallel
with the improved speeds of broadband connections.
In the late 1990s, 128 kb/s (kilobits per second) was considered
the norm for good quality MP3 files. The standard
specification for MP3s allows music to be encoded at
bit rates from 8 kb/s to 320 kb/s.
At 320 kb/s, Red Book audio is only compressed about 4.4 to 1,
and can sound very good.
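The compression ratios quoted in this section all come from dividing the Red Book bit rate by the target bit rate. The sketch below shows the calculation for two of the figures mentioned; the names are assumptions, and any other bit rate can be passed in the same way.

```python
CD_BITS_PER_SECOND = 16 * 2 * 44_100  # Red Book stereo PCM

def compression_ratio(target_bits_per_second: float) -> float:
    """How many times smaller the stream must be to fit the target rate."""
    return CD_BITS_PER_SECOND / target_bits_per_second

if __name__ == "__main__":
    print(compression_ratio(50_000))    # realistic dial-up modem: nearly 30 to 1
    print(compression_ratio(128_000))   # 128 kb/s MP3: about 11 to 1
```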
Audio file data does not compress effectively using the
general-purpose DEFLATE algorithm used by packages such as PKZIP and WinZip.
Lossless algorithms designed specifically for audio are needed to shrink it
and still recreate the original data exactly, just as
PKZIP/WinZip can exactly reproduce Word documents or other text
files.
There are many lossless compression algorithms and file formats. These include:
- FLAC - Free Lossless Audio Codec
- Shorten
- MLP - Meridian Lossless Packing (used for DVD-audio)
- WMA Lossless (for Windows)
- Apple Lossless (ALAC) for Apple Macs
These lossless algorithms can achieve approximately a two to one compression.