4 Sound Waves and Sine Waves
John Pierce

4.1 Sound and Sine Waves

We are immersed in an ocean of air. Physical disturbances (snapping
the fingers, speaking, singing, plucking a string, or blowing a
horn) set up a vibration in the air around the source of sound. A
sound wave travels outward from the source as a spherical wave
front. It is a longitudinal wave in which the pulsating motion of the
air is in the direction the wave travels. In contrast, waves in a
stretched string are transverse waves, for the motion of the string is
at right angles to the direction in which the wave travels.

How fast does a sound wave travel? If the air temperature is 20
degrees Celsius, a sound wave travels at a velocity of 344 meters
(1128 feet) a second, a little faster at higher temperatures and a
little slower at lower temperatures. Sound travels in helium almost
three times as fast as in air, and longitudinal sound waves can travel
through metals and other solids far faster.

The sound waves that travel through the air cause components of
our ears to vibrate in a manner similar to those of the sound source.
What we hear grows weaker with distance from the source, because
the area of the spherical wavefront increases as the square of the
distance from the source, and the power of the source wave is spread
over that increasing surface. What actually reaches our ears is
complicated by reflections from the ground and other objects. In a room,
much of the sound we hear comes to our ears after being reflected
from floor, walls, and ceiling.

The vibrations of musical sound are complicated, and the charm
of musical sounds lies in their complexity. But most old-time
discussions of musical sounds and most old-time experiments with
sound waves and with hearing were carried out with a rare and
simple sort of sound wave, a sinusoidal wave. How can such
discussions and experiments have any relevance to the complicated
sounds of music? Chiefly, because the phenomenon of sound
propagation in air at intensities encountered in musical performance is
a linear phenomenon. The undisturbed vibrations of strings or of
columns of air are at least approximately linear. Even the vibrations
along the cochlea of the ear are close enough to linear for linear
systems ideas to be an appropriate guide to thought. What are the
characteristics of linear systems? How can sine waves be useful
in connection with the complicated sounds of music?

4.2 Linear Systems

Sine waves are important both mathematically and practically in
describing the behavior of linear systems.

What is a linear system? The amplifier depicted in figure 4.1
illustrates a linear system. Suppose that an input signal or waveform In1
produces an output waveform Out1, and that an input waveform In2
produces an output waveform Out2. If the amplifier is linear, the
combined input waveform In1+In2 will produce an output waveform
Out1+Out2. The output of a linear amplifier (or of any linear
system or phenomenon) for a sum of inputs is the sum of the outputs
produced by the inputs separately.

It may be easier to understand if we say that an amplifier is linear
if it doesn't produce any distortion. In some real amplifiers there is
distortion. We hear things in the output that were not present in the
input. Mathematically, a linear system is a system whose behavior
is described by a linear differential equation or by a linear partial
differential equation. In such an equation the sum of constants times
partial derivatives with respect to time and space is equal to 0, or to
an input driving function. Some linear, or approximately linear,
systems are the following:

A sound wave in air (linear for musical intensities)
A vibrating string (linear for small amplitudes)
A vibrating chime or bell (ordinarily linear)
The bones of the middle ear (linear for small changes in level)
Vibrations along the basilar membrane of the cochlea (with some assumptions).

Figure 4.1 A system is linear if the output due to two overlapping inputs is the sum of the outputs to
each input separately.
Tamtams, cymbals, and some other percussion instruments
exhibit a clearly nonlinear phenomenon: an upwelling of high
frequencies after striking. Smaller nonlinearities in nearly all musical
instruments are responsible for subtle but characteristic musical
qualities of the sounds produced. But the most obvious features of
the sounds of conventional instruments are consistent with an
assumption of linearity.

To the degree to which an instrument such as a piano, guitar, bell,
or gong is linear, the vibrations it produces can be represented as a
sum of slowly decaying sine waves that have different frequencies.
Each frequency is associated with a particular spatial distribution of
vibrations and has a particular rate of decay. The sound of the wave
generated by such a sum of vibrations at different frequencies
constitutes a musical tone.

The frequencies of free vibrations of a violin string or the air in a
horn predispose the forced (by bowing or blowing) vibrations to
have frequencies quite close to those of a free vibration. Skillful
bowing of a violin string can give harmonics, which are integer
multiples of some fundamental frequency. A bugle can be blown so as
to produce a rather small number of musical tones, each near a
frequency of the free vibration of the air in the tube, again a series of
harmonics.

4.3 Sine Waves

Because sine waves, and measurements based on sine waves, are
pervasive in musical lore, it is important at this point to become
well acquainted with a sine wave. Figure 4.2 shows a swinging
pendulum that traces out a portion of a sine wave on a moving strip of
paper. A true sine wave lasts forever, with its past, present, and
future an endless repetition of identical periods or cycles of
oscillation. A sine wave can be characterized or described completely
by three numbers: the maximum amplitude (in centimeters, volts,
sound pressure, or some other unit of measurement), the frequency
in Hertz (Hz, cycles per second), and the phase, which specifies the
position when the sine wave reaches its peak amplitude. This is
illustrated in figure 4.3.

With respect to phase, we should note that the mathematical
cosine function is at its peak when the phase is 0 degrees, 360 degrees,
720 degrees, and so on. The mathematical sine function reaches its
peak at 90 degrees, 450 degrees, and so on.

Figure 4.3 Sine waves are described completely by their frequency (or period), amplitude, and phase.

The relative amplitudes of sine waves are often expressed in terms
of decibels (dB). If wave 1 has a peak amplitude of vibration of A1
and a reference wave or vibration has a peak amplitude of vibration
of A2, the relationship in decibels of vibration A1 to vibration A2 is
given by

$$20 \log_{10}(A_1/A_2). \qquad (4.1)$$

A sound level in decibels should always be given as decibels
above some reference level. Reference level is often taken as a sound
power of a millionth of a millionth of a watt per square meter. A
person with acute hearing can hear a 3000 Hz sine wave at reference
level. Reference level is sometimes also taken as a sound pressure of
0.00002 newtons per square meter, which is almost exactly the same
reference level as that based on watts per square meter.

Figure 4.4 The equal loudness curves link combinations of sound pressure level and frequency that
are heard as equally loud. (Axes: sound pressure level in dB against frequency in Hz, with the
musical notes A0 to A8 marked.)

In many experiments with sound we listen to sounds of different
frequencies. It seems sensible to listen to sinusoidal sound waves in
an orderly fashion. We will use the diagram shown in figure 4.4 to
guide our listening. Here frequency in Hertz is plotted horizontally.
Nine vertical lines are shown, spaced one octave apart at frequencies
from 27.5 Hz (A0), the pitch frequency of the lowest key on the
piano keyboard, to 7040 Hz (A8), above the topmost piano key.

The curves shown are equal loudness curves. Along a particular
loudness curve the various combinations of frequency and level give
sounds that are judged to have the same loudness. The constant
loudness curves crowd together at low frequencies. At low
frequencies, a small change in amplitude results in a large change in
loudness. There is some crowding together at about 4000 Hz.

We can listen to tones at a chosen frequency given by one of the
vertical lines at six different amplitudes, each successively 10 dB
below the preceding amplitude. This tells us how a sinusoidal
sound of a given frequency sounds at six sound levels 10 dB apart.
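
As a minimal sketch of the decibel relation in equation 4.1, the Python fragment below converts between amplitude ratios and decibels and lists the amplitude ratios for six levels spaced 10 dB apart. The function names are illustrative and do not come from the text.

```python
import math

def amplitude_ratio_to_db(a1, a2):
    """Level of amplitude a1 relative to amplitude a2, in decibels (equation 4.1)."""
    return 20.0 * math.log10(a1 / a2)

def db_to_amplitude_ratio(db):
    """Inverse of equation 4.1: the amplitude ratio for a level difference in dB."""
    return 10.0 ** (db / 20.0)

# Six levels, each 10 dB below the preceding one, as in the listening exercise.
steps = [db_to_amplitude_ratio(-10.0 * k) for k in range(6)]
for k, ratio in enumerate(steps):
    print(f"{-10 * k:4d} dB -> amplitude ratio {ratio:.4f}")
```

Each 10 dB step divides the amplitude by the square root of 10, so after five steps the amplitude is smaller by a factor of about 316 while the level is only 50 dB lower.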
Of course, the sine wave sounds fainter with each 10 dB decrease in
amplitude. What we hear depends on the initial sound level, and
that depends on the audio equipment and its loudness setting. But,
roughly, this is what we hear:

At 27.5 Hz, a weak sound that disappears after a few 10 dB falls
in level. The constant loudness curves are crowded together at this
low frequency, and a few 10 dB decreases in amplitude render the
sound inaudible
At 110 Hz, a stronger sound that we hear at all successively lower
sound levels
At 440 Hz, the pitch to which an orchestra tunes, a still stronger
sound
At 1760 Hz, a still stronger sound
At 7040 Hz, a somewhat weaker sound. With increasing age people
tend to hear high-frequency sounds as very weak, or not to hear
them at all.

4.4 Sine Waves and Musical Sounds

One importance of sine waves is that for linear oscillating systems,
the overall vibration of a musical instrument can be regarded as the
sum of sinusoids of different frequencies. This is illustrated in figure
4.5, which shows several patterns of oscillation of a vibrating string.

In the vibration at the top, the deviation of the string from
straightness varies sinusoidally with distance along the string. The
center of the string vibrates up and down, with a sinusoidal
displacement as a function of time, and the oscillation falls smoothly
to 0 at the ends. At any instant the variation of displacement with
distance along the string is sinusoidal. We can think of the
oscillation of the string as corresponding to a traveling sine wave of twice
the length of the string, reflected at the fixed ends of the string. We
can describe this pattern of oscillation as having one loop along the
string. Below we see patterns of vibration with two and three loops
along the string.

Figure 4.5 Some modes of vibrations of a stretched string. Different modes have different numbers
of loops: from top to bottom, here, one, two, three. The frequencies of vibration are
proportional to the number of loops.

In agreement with the original observations of Pythagoras as
interpreted in terms of frequency of vibration, the frequencies of the
various patterns of vibration are proportional to the number of loops
along the string. Thus, if f0 is the frequency for vibration at the top
of figure 4.5, the frequencies of vibration shown lower are 2f0 (two
loops) and 3f0 (three loops). Other “modes” would have frequencies
of 4f0 (four loops), 5f0, and so on.

Ordinarily, when we excite the string by plucking or striking, we
excite patterns of vibration at many different frequencies that are
integers (whole numbers) times the lowest frequency. In one period
of duration, 1/f0, the various other harmonic frequencies of
oscillation, corresponding to two, three, four, five, and so on loops, will
complete two, three, four, five, and so on oscillations. After the
period 1/f0, the overall oscillation will repeat again, “endlessly” in an
ideal case of no decay in amplitude.

We have considered various aspects of sine waves that we hear.
Wavelength is an aspect of sinusoidal sound that is associated with
a sound wave traveling through air. The wavelength of a sinusoidal
sound is sound velocity divided by frequency. As noted in section
4.1, the velocity of sound in air is 344 meters/second (1128 feet/second).
In table 4.1, wavelength is tabulated for various frequencies
(and musical pitches). We see that in going from the lowest key on
the piano, A0 (frequency 27.5 Hz), to A7, the highest A on the
keyboard (frequency 3520 Hz), the wavelength goes from 41 feet (12.5
meters) to 0.32 foot (0.1 meter). Actual musical tones include
harmonics whose wavelengths are much shorter than that of the
fundamental or pitch frequency.

Table 4.1 Musical notes, frequencies, and wavelengths

NOTE NAME   FREQUENCY (HZ)   WAVELENGTH (FT.)
A0          27.5             41
A1          55               20.5
A2          110              10.25
A3          220              5.1
A4          440              2.56
A5          880              1.28
A6          1760             0.64
A7          3520             0.32

For some musical instruments (including some organ pipes and
the clarinet), the sounds produced contain chiefly odd harmonics of
a fundamental frequency. This happens whenever one end of a tube
is closed and the other end is open. If f0 is the fundamental
frequency of a closed organ pipe, the chief frequencies present are f0,
3f0, 5f0, 7f0, and so on.

We can represent the sustained sound of a musical instrument
by a sum of sine waves with many harmonic frequencies. But we
hear the sound as a single musical tone with a pitch that is given by
the pitch frequency, the frequency of which the frequencies of all
the partials are integer multiples. The pitch of a musical sound
depends on the simple harmonic relation among the many frequencies
present. The musical quality of the overall sound depends in part
on the relative intensities of the various harmonics, and in part on
how they are excited initially (the attack quality of the sound). (We
will discuss the topics of pitch and quality further in later chapters.)

4.5 Fourier Analysis

Most musical instruments produce sounds that are nearly periodic.
That is, one overall cycle of the waveform repeats, or nearly repeats,
over and over again. Looking at this in another way, traditional
musical tones, of the voice or of an instrument, are periodic, or nearly
periodic. Hence, it is pertinent to consider the general qualities of
periodic sounds. Any periodic waveform can be approximated by a
number of sinusoidal components that are harmonics of a
fundamental frequency. That fundamental frequency may or may not be
present in the sound. It is the reciprocal of the period of the
waveform measured in seconds.

This is illustrated in figure 4.6 by three approximations of a
sawtooth waveform. In approximating a sawtooth waveform we add
harmonic-related sine waves whose frequencies are f0, 2f0, 3f0, and so on,
and whose amplitudes are inversely proportional to the frequencies.
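
The recipe just described can be tried numerically. The sketch below assumes sine-phase harmonics with amplitude 1/n for the n-th harmonic (the phases and the 220 Hz fundamental are choices made here for illustration, not values from the text). It compares partial sums against the sawtooth that the full series converges to, using the known identity that the sum of sin(nx)/n over all n equals (pi - x)/2 for x between 0 and 2 pi.

```python
import math

def sawtooth_partial_sum(t, f0, n_harmonics):
    """Sum of harmonics n*f0 with amplitudes 1/n, evaluated at time t (seconds)."""
    return sum(math.sin(2.0 * math.pi * n * f0 * t) / n
               for n in range(1, n_harmonics + 1))

def ideal_sawtooth(t, f0):
    """Limit of the series: (pi - x)/2 for phase x in (0, 2*pi)."""
    x = (2.0 * math.pi * f0 * t) % (2.0 * math.pi)
    return (math.pi - x) / 2.0

def rms_error(f0, n_harmonics, n_points=1000):
    """Root-mean-square difference from the ideal sawtooth over one period."""
    period = 1.0 / f0
    total = 0.0
    for i in range(n_points):
        t = (i + 0.5) * period / n_points  # midpoints avoid the jump at t = 0
        diff = sawtooth_partial_sum(t, f0, n_harmonics) - ideal_sawtooth(t, f0)
        total += diff * diff
    return math.sqrt(total / n_points)

# Compare approximations built from 3, 6, and 12 harmonics.
errors = {n: rms_error(220.0, n) for n in (3, 6, 12)}
for n, err in errors.items():
    print(f"{n:2d} harmonics: RMS error {err:.3f}")
```

The RMS error shrinks as harmonics are added, which is the "gross" improvement; it says nothing about the wiggle near the jump, which an average over the period hides.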
Three sine waves give a very poor approximation to a sawtooth. A
better approximation is given by 6 sinusoidal components, and a
still better approximation by 12.

Figure 4.6 Representation of a sawtooth wave as the sum of one, two, and three sinusoids.

A true sawtooth waveform is a succession of vertical and slanting
straight-line segments. A Fourier series approximation to a true
sawtooth waveform that uses a finite number of harmonically related
sine waves differs from the sawtooth waveform in two ways. In a
gross way, the approximation gets better and better as we include
more and more terms. But there is a persistent wiggle whose
amplitude decreases but whose frequency increases as we add more and
more terms. We will see later that the ear can sometimes hear such a
wiggle, as well as a pitch associated with a true sawtooth waveform.
Remember, from the equal loudness contours, that we can hear only
up to a given frequency, so if we add enough harmonic sinusoids to
our approximation of any wave, we can get perceptually as close as
we like.

A fitting of the sum of harmonically related sine waves to a
periodic waveform is called Fourier analysis. Mathematically, the
Fourier series transform is defined by the equations

$$v(t) = \sum_{n=-\infty}^{\infty} C_n e^{j 2\pi n t / T} \qquad (4.2)$$

$$C_n = \frac{1}{T} \int_{-T/2}^{T/2} v(t)\, e^{-j 2\pi n t / T}\, dt. \qquad (4.3)$$

These describe the representation of a periodic time waveform v(t)
in terms of complex coefficients Cn that represent the phases and
amplitudes of the harmonic sinusoidal components (4.2), and the
expression for finding the coefficients Cn from the waveform signal
v(t) (4.3). The coefficients Cn are found by integrating over the period
(T) of the waveform.

What about waveforms that aren't periodic? The equations

$$v(t) = \int_{-\infty}^{\infty} V(f)\, e^{j 2\pi f t}\, df \qquad (4.4)$$

$$V(f) = \int_{-\infty}^{\infty} v(t)\, e^{-j 2\pi f t}\, dt \qquad (4.5)$$

give expressions for an arbitrary, nonperiodic waveform in terms of
a complex sound spectrum V(f) that has frequencies ranging from
minus infinity to plus infinity, and an integral that, for a given
waveform v(t), gives the complex spectral function V(f). Such an overall
resolution of a complete waveform into a spectrum is of limited use
in connection with music. For example, we could in principle find
the spectrum of a complete piece of music. This would tell us very
little that we would care to know. Today, most Fourier analyses of
waveforms are performed by computer programs, using a discrete
definition of the Fourier transform.

It is important to note that a waveform, that is, a plot one cycle
long of amplitude versus time, is a complete description of a
periodic waveform. A spectrum gives a complete description of a
waveform, consisting of two numbers for each single frequency. These
two numbers can describe the real and imaginary parts of a complex
number, or they can describe the amplitude and phase of a particular
frequency component. Conversion back and forth from complex
number representation to amplitude and phase representation is
accomplished simply.

In plots of spectra of sound waves, the phase of the spectral
components is seldom displayed. What is plotted against frequency is
usually how the amplitude varies with frequency. The amplitude is
often given in decibels. Or the square of the amplitude is plotted
versus frequency. This is called a power spectrum.

Is the phase of a Fourier component important? Figure 4.7 shows
4 periods of waveforms made up of 16 sinusoidal components with
harmonic frequencies (f0, 2f0, 3f0, etc.) having equal amplitudes
but different phases. The waveforms look very different. The
topmost waveform is a sequence of narrow spikes with wiggles in
between. In the center waveform the phases have been chosen so as to
make each repeated cycle of the waveform look like a sinusoid of
decreasing frequency, also called a chirp. In the waveform at the
bottom, the relative phases were chosen at random, and the
waveform looks like a repeating noise. Although the amplitude spectrum
is the same for all three waveforms, the phase spectra are different
and the waveforms look very different. These three different
waveforms sound different at 27.5 Hz with headphones. At 220 Hz the
sounds scarcely differ with headphones. At 880 Hz there is no
difference in sound. In a reverberant room, differences are small even at
27.5 Hz.

Figure 4.7 The effect of phase on waveform. Sixteen harmonically related sine waves of equal
amplitude make up the three waveforms, with the only difference being phase. (Panels, top to
bottom: 16 sine waves, equal amplitudes and phases; 16 sine waves, equal amplitudes, Schroeder
phases; 16 sine waves, equal amplitudes, random phases.)

Partly because we don't listen through headphones, and partly
because most pitches are higher than 27.5 Hz, most plots of spectra
take no account of phase.

It can be important to know how the frequency content of a
musical sound changes with time. Many sustained musical sounds have
small, nearly periodic changes of amplitude (tremolo) or of
frequency (vibrato). And there are attack and decay portions of musical
sounds. As an example of the importance of this, Jean-Claude Risset
and Max Mathews found in 1969 that in the sounds of brassy
instruments, the higher harmonics rise later than the lower harmonics.
This is useful, indeed necessary, in synthesizing sounds with a
brassy timbre.

How can we present a changing spectrum in a way
that is informative to the eye? One way of representing changing
spectra is to plot successive spectra a little above and to the right of
one another, so as to give a sense of perspective in time. Figure 4.8
shows successive spectra of a sine wave with a little vibrato that
shifts the peak a little to the left, then back, repeating this pattern
periodically.

Figure 4.8 A “waterfall spectrum” representation of a sinusoidal sound with a slow sinusoidal
variation of frequency with time. (Horizontal axis: frequency in Hz, 0 to 4000.)

There is another way of representing changing spectra, a
representation by sonograms (also called spectrograms). This is particularly
valuable in studying very complicated sounds such as speech. A
sonogram of speech is shown in figure 4.9. The amplitude at a given
frequency is represented by darkness (pure white represents zero
amplitude). Distance from the bottom represents frequency. Time is
plotted left to right. The two sonograms are of the same speech
sound. In the upper sonogram, resolution is good in the frequency
direction (we can see individual harmonic tracks), but it is blurred
in the time direction. In the lower sonogram the resolution is good
in the time direction (we can see individual pitch periods
representing the vibrations of the vocal folds), but it is fuzzy in the
frequency direction. Resolution can't be sharp in both directions. If we
want precise pitch, we must observe the waveform for many periods.
If we want precise time, we must observe the waveform for only part
of a period. In general, the product of resolution in frequency and
resolution in time is constant. This is a mathematical limitation that
has nothing to do with the nature of the sound source.

Figure 4.9 Spectrograms in which amplitude or intensity is represented by degree of darkness.
(Horizontal axis: time in seconds.)

Fourier analysis, the representation of a periodic waveform in
terms of sine waves, is an essential tool in the study of musical
sound. It allows us to determine the frequency components of a sound
and to determine how those components change with time. Is the
waveform or the spectrum better? If you are looking for a weak
reflection following a short sound (as in radar), the waveform is
better. But suppose you want to find the sound of a tin whistle in the
midst of orchestral noise. You may have a chance with a spectral
analysis that sharply separates sound energy in frequency. You
won't have a chance by just looking at the waveform. So both
waveforms and spectra are legitimate and useful ways of depicting
sounds.

What we actually do in Fourier analysis of musical sounds is to
use a computer program, called a fast Fourier transform (FFT). The
analysis produces a spectrum that gives both amplitude and phase
information, so that the waveform can be reconstructed from the
spectrum obtained. Or the amplitude alone can be used in a spectral
plot. Of an actual sound wave, we take the spectrum of a selected or
windowed portion of a musical sound that may be several periods
long.

Figure 4.10 illustrates the process of windowing. At the top are a
few periods of a sine wave. In the center is a windowing function.
This is multiplied by the overall waveform to give the windowed
portion of the waveform, shown at the bottom. In analyzing the
waveform, a succession of overlapping windows is used to find out
how the spectrum varies with time. This is the way the data were
prepared for constructing the waterfall and sonogram plots of figures
4.8 and 4.9.

Figure 4.10 Windowed time function. Top, time function; center, time window function; bottom,
windowed time function whose Fourier transform is to be taken.

A strict reconstruction of the waveform from the spectrum
obtained from any one window would repeat over and over again, but
such a reconstruction is never made. In constructing the variation
of the spectrum with time, or in reconstructing the waveform from
the spectra of successive windowed waveforms, each windowed
waveform is limited to the time duration of the window. Such a
reconstruction necessarily goes to 0 at the ends of a particular
window, where the window and the windowed waveform go to 0.
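
The windowing step can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration rather than something the text prescribes: a Hann window as the window shape, a sampling rate of 8000 samples per second, a 256-sample window, a 440 Hz test tone, and a naive discrete Fourier transform written out directly instead of an FFT routine.

```python
import cmath
import math

def hann_window(n):
    """Hann window of length n: starts at 0, peaks mid-window, falls back toward 0."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform for bins 0 .. n/2 (naive O(n^2))."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        mags.append(abs(acc))
    return mags

sample_rate = 8000.0   # samples per second (assumed)
n = 256                # window length in samples (assumed)
freq = 440.0           # test tone in Hz (assumed)

# Top of figure 4.10: the wave. Center: the window. Bottom: their product.
wave = [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]
window = hann_window(n)
windowed = [w * s for w, s in zip(window, wave)]

# Spectrum of the windowed segment; the strongest bin sits near the tone.
mags = dft_magnitudes(windowed)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
peak_freq = peak_bin * sample_rate / n
print(f"peak at bin {peak_bin}, about {peak_freq:.1f} Hz")
```

The bins are spaced sample_rate / n apart, so a longer window locates the tone's frequency more finely at the cost of covering a longer stretch of time.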
The analysis of a succession of overlapping windowed waveforms
makes it possible to construct an overall spectrum that varies with
time, and from this overall spectrum the waveform itself can be
reconstructed.

Fourier analysis is a mathematical verity. It is useful in connection
with musical tones because the ear sorts sounds into ranges of
frequency, and tampering with the sound spectrum has clear effects on
what we hear and identify. Consider the sound of a human voice. If
we remove or filter out low frequencies, the sound becomes high
and tinny, but its musical pitch does not change. If we filter out
the high frequencies, the voice becomes dull. Nonperiodic fricative
(noise) components of a sound are identified through the higher
frequencies of their spectra. If we filter out the higher frequencies, we
can't tell f (as in fee) from s (as in see).

4.6 The Sampling Theorem

Fourier analysis allows us to represent a periodic waveform in terms
of sine waves whose frequencies are harmonics of some fundamental
frequency, the lowest frequency present in the representation. More
generally, any waveform, periodic or not, can be represented
by its spectrum, a collection of sine waves. Mathematically, the
spectrum and the waveform itself are alternative ways of describing
the same signal. We can think of the same signal either as a time
waveform or as a spectrum, a collection of sine waves.

All spectral representations of sound waveforms are band limited.
That is, somewhere along the line all frequency components that lie
outside of a prescribed range of frequencies have been eliminated,
have been filtered out by a circuit that will pass only frequency
components that lie in some delimited bandwidth. In musical
applications this bandwidth extends from quite low frequencies to a
frequency of tens of thousands of Hertz.

The sampling theorem tells us that a band-limited waveform of
bandwidth B Hz can be represented exactly, and (in principle) can
be reconstructed without error from its amplitudes at 2B equally
spaced “sampling times” each second. For example, 20,000 sample
amplitudes a second completely describe a waveform of bandwidth
10,000 Hz. In sampling and reconstruction of signals, any
component of frequency B+f (f is some frequency) will give rise to sample
amplitudes that will produce, in the output after filtering, a
component of frequency B-f. This phenomenon of the presence of false
frequency components in the output is called aliasing.

Figure 4.11 illustrates the process of sampling a continuous
waveform. At the top we have a waveform that contains no frequencies
greater than some bandwidth B. We sample this waveform 2B times
a second and transmit or store successive samples as the amplitudes,
represented in the drawing by the amplitudes of the short pulses in
the lower part of the figure.

To reconstruct a sampled waveform, we turn each received sample
amplitude into a short pulse with an amplitude proportional to the
amplitude of the sample. We filter the sequence of pulses with a
filter that passes no frequencies higher than B. The filter output is a
faithful representation of the original signal that was sampled 2B
times a second. For this process to work, the signal that is sampled
must contain no frequencies greater than B. That means that a filter
with an infinitely sharp cutoff must be used. Finally, the phase shifts
of all filters must be strictly proportional to frequency.

Figure 4.11 A waveform of bandwidth B (upper) can be sampled (lower), and recovered exactly from
2B samples (numbers representing the amplitude) per second. The waveform is recovered from the
samples by filtering (smoothing).

Such unrealistic filters can't be made or used for many reasons.
One involves the time-bandwidth constant described earlier in this
chapter. Filters with infinitely steep cutoffs would require infinite
time to implement. Since the bandwidth can't be made strictly equal
to B, the aliasing components must be reduced somehow. The
“cure” is to limit the bandwidth to somewhat less than half the
sampling rate, so as to reduce the effect of aliasing rather than
to eliminate it entirely. Thus, an actual sampling rate of 44,100
samples a second (the compact disc sampling rate) is used to attain
a bandwidth of around 20,000 Hz rather than the ideal bandwidth
of 22,050 Hz.

In actual systems employing sampling, the sample amplitudes are
represented as digital numbers. The amplitudes are specified by
groups of binary digits. As many as 21 such digits are in commercial
use (Deutsche Grammophon). In standard compact disc recording, the
accuracy of representation of sample amplitudes is commonly 16
binary digits, which gives a signal-to-noise ratio of around 90 dB.

4.7 Filter Banks and Vocoders

In the compact disc system the whole bandwidth of an audio signal is
encoded by means of one set of samples. However, filters can be
used prior to sampling to break an audio signal of bandwidth B into
N adjacent frequency channels, each of bandwidth B/N, as indicated
in figure 4.12. These channels could in principle be separated
sharply, but it is also permissible that adjacent frequency bands
overlap in any way such that the sum of the outputs of the
overlapping filters gives the original signal.

Figure 4.12 A filter bank divides the signal into overlapping frequency bands; the sum of these
band-limited signals is the original signal.
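
The requirement that the band signals add back up to the original can be checked numerically. The sketch below takes the sharply separated case and, as an assumption made here for simplicity, builds each "filter" by keeping a block of discrete Fourier transform bins and zeroing the rest. The names `filter_bank`, `dft`, and `idft` are illustrative, and the transforms are written out naively so the example needs only the standard library.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for short signals."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse transform; returns the real part, which is exact for symmetric spectra."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * i / n)
                for k in range(n)).real / n
            for i in range(n)]

def filter_bank(x, n_channels):
    """Split x into n_channels band signals using non-overlapping DFT-bin masks."""
    n = len(x)
    spectrum = dft(x)
    half = n // 2
    channels = []
    for c in range(n_channels):
        lo = c * (half + 1) // n_channels        # first bin in this band
        hi = (c + 1) * (half + 1) // n_channels  # one past the last bin
        band = [0j] * n
        for k in range(n):
            f = min(k, n - k)  # fold negative-frequency bins onto 0..half
            if lo <= f < hi:
                band[k] = spectrum[k]
        channels.append(idft(band))
    return channels

n = 64
signal = [math.sin(2 * math.pi * 3 * i / n) + 0.5 * math.sin(2 * math.pi * 20 * i / n)
          for i in range(n)]
channels = filter_bank(signal, 4)

# Adding the channel signals sample by sample should give back the input.
recombined = [sum(ch[i] for ch in channels) for i in range(n)]
max_error = max(abs(a - b) for a, b in zip(signal, recombined))
print(f"largest reconstruction error: {max_error:.2e}")
```

Because every bin is assigned to exactly one channel, the masks sum to one at each frequency, which is the sharply separated special case of the overlap condition stated in the text.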
It turns out that the overall number of samples per second needed
to describe an ideal multichannel signal is simply 2B. 2B/N samples
per second are allotted to encode each channel of bandwidth B/N.
For reasons historical rather than logical, a system that encodes a
waveform of bandwidth B into N channels of bandwidth B/N (and
then recombines the channels to get the original signal) is called a
phase vocoder.
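The sample count stated in this paragraph follows from one line of arithmetic. The values B = 20,000 Hz and N = 10 below are illustrative assumptions, not figures from the text.

```latex
\[
N \cdot \frac{2B}{N} = 2B,
\qquad \text{e.g. } B = 20{,}000\ \text{Hz},\ N = 10:\quad
10 \times \frac{2 \times 20{,}000}{10} = 10 \times 4{,}000
= 40{,}000\ \text{samples per second}.
\]
```

Splitting the signal into channels therefore neither saves nor costs samples in the ideal case; it only redistributes them.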
We note in figure 4.12 that the filtered channels of successively
increasing frequency constitute a spectrum of the signal that depicts
the signal as a variation of amplitude and phase in bands with
increasing center frequencies. Indeed, the channel signals of a phase
vocoder are commonly derived through a process of digital spectral
analysis using the FFT. In this process, the overall signal waveform is
cut into overlapping windowed (as in figure 4.10) segments. The
spectrum of each windowed waveform is obtained. Under certain
constraints depending on the shape of the window used, the
successive spectra describe the original waveform, and the original
waveform can be recovered from the successive spectra.
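One common way to satisfy the window constraint is to pick a window whose overlapped copies add up to a constant. The sketch below assumes a periodic Hann window with a hop of half the frame length, for which the copies sum to exactly 1. The function names `stft` and `istft` and the frame sizes are illustrative choices, not taken from the text.

```python
import numpy as np


def stft(signal, frame_len, hop):
    """Spectra of overlapping Hann-windowed segments, one row per segment."""
    # Periodic Hann window: copies shifted by frame_len / 2 sum to 1.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spectra, frame_len, hop):
    """Recover the waveform by inverse FFT and overlap-add."""
    frames = np.fft.irfft(spectra, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + frame_len] += frame
    return out


rng = np.random.default_rng(1)
x = rng.standard_normal(4096)
frame_len, hop = 512, 256
recovered = istft(stft(x, frame_len, hop), frame_len, hop)

# The first and last half-frames are covered by only one window,
# so the comparison is restricted to the fully overlapped interior.
interior_error = np.max(np.abs(recovered[hop:-hop] - x[hop:-hop]))
```

With a window that does not sum to a constant, the same overlap-add would leave a periodic amplitude ripple in the output, which is why the window shape matters.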
What is gained through the resolution of the overall waveform into
a number of narrowband waveforms that vary with time? In figure
4.13, the waveform in any “channel” (frequency range of the
analyzed signal) will look very much like a sine wave whose frequency
is the center frequency of that channel, and whose amplitude and
phase change slowly with time. This makes it possible to operate on
the channel signals in interesting ways.
[Figure 4.13: The phase vocoder splits a signal into bands of equal spacing and bandwidth. The input feeds six filters, each of bandwidth 1 kHz, with center frequencies 500, 1500, 2500, 3500, 4500, and 5500 Hz. The channel signals can be coded, stored, modified, or transmitted; the output equals the input (or a modified input).]

Suppose, for example, we double the frequency of each channel
signal. That is, wherever the original signal goes up and down, we
construct a signal of approximately the same amplitude and phase
that goes up and down twice. Roughly at least, we double all
frequencies in the original signal and shift its pitch upward by an
octave. What a fine way to create speech or song of very high pitch!
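For a single narrowband channel, the doubling can be sketched as follows. The approach assumed here, which the text does not prescribe, is to form the analytic signal of the channel, read off its slowly varying amplitude and its phase, and rebuild a cosine whose phase advances twice as fast. The helper names and the 500 Hz test tone are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Complex signal whose real part is x and whose spectrum has no negative frequencies."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1 : n // 2] = 2.0
    else:
        gain[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def double_frequency(channel):
    """Keep the channel's amplitude envelope but make its phase advance twice as fast."""
    z = analytic_signal(channel)
    amplitude = np.abs(z)
    phase = np.angle(z)
    return amplitude * np.cos(2.0 * phase)


# Illustrative channel: a 500 Hz sine wave with a slow 2 Hz amplitude variation.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate  # one second of samples
channel = (1.0 + 0.3 * np.sin(2 * np.pi * 2 * t)) * np.cos(2 * np.pi * 500 * t)
shifted = double_frequency(channel)

# With one second of signal, FFT bin k corresponds to k Hz.
peak_before = int(np.argmax(np.abs(np.fft.rfft(channel))))
peak_after = int(np.argmax(np.abs(np.fft.rfft(shifted))))
```

Applying the same operation to every channel and summing the results is what shifts the whole signal up by roughly an octave. A practical system would also have to keep the doubled components below half the sampling rate.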
Halving the frequencies instead would create speech or song of very
low pitch.

Suppose that we delete every other sample in each narrowband
channel. For a fixed final sample rate this will double the speed of
the reconstructed signal without appreciably changing its spectrum.
Or, if we interpolate samples in the narrow channels, we can slow
the rate of speaking or singing.

In the phase vocoder the signal is filtered into overlapping bands
of equal bandwidth, as shown in figure 4.12. If the signal were
divided into N overlapping bands, but the contour of each higher filter
were the contour of the lowest filter stretched out by a constant
factor (2, 3, 4), we would have a representation of the overall sound
by wavelets.

4.8 Wavelets and the Sampling Theorem
So far, we have assumed that the successive ﬁlters are identical in
bandwidth and are equally spaced in frequency, as shown in ﬁgure
4.13. In that ﬁgure the boxes represent amplitude versus frequency
for successive filters. Each filter contributes the same fraction to the
total bandwidth. If the input to the phase vocoder is a very short
pulse, the output of each filter will have the same duration but a
different frequency.

Now consider filters such that the bandwidth of the next higher
filter is broader than the preceding filter by some constant greater
than unity. A simple example of such filters is shown in figure 4.14.
In this figure the triangles that represent the response versus
frequency of the individual filters are broader with increasing
frequency. Each triangle is twice as broad and twice as far to the right
as its predecessor. The filters of figure 4.14 are a simple example of
overlapping filters in which the contour of the next higher filter is
just like that of the preceding filter but is broader by a constant factor
greater than 1. Suppose such a bank of filters is excited by a very
short pulse. The output of any one of the filters is called a wavelet.
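A small numerical sketch shows how the duration of such a pulse response depends on the filter's bandwidth. It assumes zero-phase triangular filters with bandwidths of 200, 400, and 800 Hz, as in figure 4.14. The center frequencies (200, 400, and 800 Hz), the sampling rate, and the function names are assumptions made for illustration.

```python
import numpy as np


def triangle_response(freqs, center, width):
    """Triangular gain: 1 at the center, 0 at center +/- width / 2."""
    return np.clip(1.0 - np.abs(freqs - center) / (width / 2.0), 0.0, None)


def wavelet_bank_impulse_responses(n_samples, sample_rate,
                                   first_center, first_width, n_filters):
    """Impulse responses of filters that double in center and width each step."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    responses = []
    for k in range(n_filters):
        scale = 2.0 ** k
        gain = triangle_response(freqs, first_center * scale, first_width * scale)
        # The spectrum of a unit impulse is flat, so the filter output
        # is simply the inverse transform of the gain curve.
        responses.append(np.fft.irfft(gain, n=n_samples))
    return np.array(responses)


def rms_duration(impulse_response, sample_rate):
    """Energy-weighted RMS spread in time, in seconds, about t = 0."""
    centered = np.fft.fftshift(impulse_response)
    n = len(centered)
    t = (np.arange(n) - n // 2) / sample_rate
    energy = centered ** 2
    return np.sqrt(np.sum(t ** 2 * energy) / np.sum(energy))


sample_rate = 8192.0
wavelets = wavelet_bank_impulse_responses(
    n_samples=8192, sample_rate=sample_rate,
    first_center=200.0, first_width=200.0, n_filters=3,
)
durations = [rms_duration(w, sample_rate) for w in wavelets]
```

Under these assumptions each doubling of bandwidth roughly halves the duration of the output, whereas the equal-bandwidth filters of the phase vocoder all respond with the same duration.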
An input waveform can be represented by a sum of such wavelets,
each having an amplitude appropriate to the signal to be
represented. The time-frequency relationship is thus more appropriately
dealt with in the wavelet filter bank, in that for each
higher-frequency filter, the time impulse response is narrower. Thus, the
optimum time-frequency tradeoff can be approached in each
subband.

Representation of musical waveforms by successions of wavelets
is related to, but different from, the phase vocoder’s spectral
analysis, in which waveforms are represented by components that are
equal in frequency spacing and in width (in Hz). Wavelets have
shown much promise for compression of images, but less promise
in audio. The use of wavelets for audio event recognition, however,
shows much more potential.

[Figure 4.14: A wavelet filter bank. Filters have increasing bandwidth. An impulse input feeds a filter bank with bandwidths of 200, 400, and 800; the filter outputs are wavelets.]

4.9 Closing Thoughts

Sine waves and Fourier analysis are powerful resources in all
studies of musical sounds. Most musical sounds are produced by linear
systems, or by approximately linear systems (freely vibrating strings
or columns of air), or by forced vibrations of such systems that reflect
their linear nature. Parts of our organs of perception are
approximately linear. Higher-level processing, while not linear in itself,
reflects the approximately linear processing that is carried out at
lower levels.

Analyses based on sine waves are of crucial value in understanding
the nature of musical sounds and of their perception. Sine
waves, however, are not music. Nor are they even musical sounds
by themselves. Rather, they are ingredients out of which musical
sounds can be concocted, or through which musical sounds can be
analyzed and studied. What we do with sine waves, how we use
them, must be guided by the capabilities and limitations of human
perception.

References

Bracewell, R. N. (1986). The Fourier Transform and Its Applications. Second edition. New
York: McGraw-Hill. An advanced book on Fourier analysis.
Chui, C. K. (1992). An Introduction to Wavelets. Boston: Academic Press.
Chui, C. K., ed. (1992). Wavelets: A Tutorial in Theory and Applications. Boston: Academic
Press. Gives more sense of actual application.
Risset, J. C., and M. V. Mathews. (1969). “Analysis of Musical Instrument Tones.” Physics
Today, 22(2): 23-40.
Schafer, R. W., and J. D. Markel, eds. (1979). Speech Analysis. New York: IEEE Press.
Contains many early papers on speech and on the phase and channel vocoders.
Steiglitz, K. (1996). A Digital Signal Processing Primer. Menlo Park, CA: Addison-Wesley.
An excellent introductory reference to Fourier analysis, sinusoids, and linear systems.