Audio Processing Problems and Solutions


Microcomputer Systems: Audio Processing Problems and Solutions
Veton Kpuska, February 11, 2012

Automatic Gain/Volume Control

One of the simplest operations that can be performed on an audio signal in a DSP is volume gain and attenuation control. For fixed-point math, this operation can be performed by multiplying each incoming sample by a fractional 16-bit value between 0x0000... and 0x7FFF..., or by using a shifter to multiply or divide the sample by a power of 2. When increasing the gain of a signal, the programmer must be aware of overflow, underflow, saturation, and quantization-noise effects.

Estimation of the Energy of the Signal

Algorithm: keep track of the maximum magnitude of the input signal:

    if (abs(in_sample) > myMax) {
        myMax = abs(in_sample);
        /* Adjust the gain to cover 80% of the overall
           dynamic range of the output. */
        new_target_gain = 0.8 * MAX_RANGE / myMax;
    }

Then compute the actual gain factor with some empirically defined function that performs the necessary smoothing, based on the desired responsiveness and smoothness of the gain:
    gain = compute_gain(gain, new_target_gain);

Apply the gain:

    out_sample = gain * in_sample;

Reset myMax and the gain factor based on some criteria.

Gain Update Function

    float compute_gain(float gain, float new_gain)
    {
        /* Linear interpolation: each call adjusts the computed
           gain by 20% toward the target gain. */
        float g, alpha = 0.2;
        g = (1 - alpha) * gain + alpha * new_gain;
        return g;
    }

Efficient Moving Average

The average over the previous N samples is

    x_avg_old = ( x[n-N] + x[n-(N-1)] + x[n-(N-2)] + ... + x[n-2] + x[n-1] ) / N

and the average over the window advanced by one sample is

    x_avg_new = ( x[n-(N-1)] + x[n-(N-2)] + ... + x[n-1] + x[n] ) / N

Subtracting the two sums, all common terms cancel:

    x_avg_new - x_avg_old = x[n]/N - x[n-N]/N

so the average can be updated with only two operations per sample:

    x_avg_new = x_avg_old + x[n]/N - x[n-N]/N

Moving Average

Note that the algorithm needs a slightly modified initialization routine while fewer samples than the averaging length N (e.g., 64) have been received:

    count = 1;
    x_avg_old = x[0];
    if (count <= N) {
        x_avg_new = (count * x_avg_old)/(count + 1) + x[count]/(count + 1);
        x_avg_old = x_avg_new;
        count = count + 1;
    } else {
        x_avg_new = x_avg_old + x[n]/N - x[n-N]/N;
    }

Amplitude Panning of Signals to the Left or Right Stereo Field

Reference: "Using the Low Cost, High Performance ADSP-21065L Digital Signal Processor for Digital Audio Applications," Dan Ledger and John Tomarakos, DSP Applications Group, Analog Devices, Norwood, MA 02062, USA.

In many applications, the DSP may need to process two (or more) channels of incoming data, typically from a stereo A/D converter.
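The incremental moving-average update derived above can be sketched in C. This is a minimal sketch; the function name `update_avg` and the use of a small static history buffer in place of a hardware delay line are illustrative assumptions:

```c
#define N 64  /* averaging length */

/* History of the last N samples, used as a circular buffer. */
static float hist[N];
static int   idx;      /* next write position                 */
static int   count;    /* samples received so far (for init)  */
static float avg;      /* running average                     */

/* Update the running average with one new sample.  While fewer
 * than N samples have been seen, maintain a true growing mean;
 * afterwards use the O(1) update  avg += x[n]/N - x[n-N]/N. */
float update_avg(float x)
{
    if (count < N) {
        avg = (count * avg + x) / (count + 1);
        count++;
    } else {
        avg = avg + x / N - hist[idx] / N;  /* hist[idx] holds x[n-N] */
    }
    hist[idx] = x;
    idx = (idx + 1) % N;
    return avg;
}
```

Note that regardless of N, the steady-state cost per sample is two multiplies/divides and two adds, which is why this form is preferred over re-summing the window.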
Two-channel recording and playback is still the dominant method in consumer and professional audio and can be found in mixers and home audio equipment. V. Pulkki [22] demonstrated placement of a signal in a stereo field (see Figure 4) using Vector Base Amplitude Panning. The formulas presented in Pulkki's paper for two-dimensional trigonometric and vector panning are shown here for reference.

Normally, the stereo signal will contain an exact duplicate of the sampled input signal, although it can be split up to represent two different mono sources. The DSP can also take a mono source and create signals to be sent out to a stereo D/A converter. Typical audio mixing consoles and multichannel recorders mix multiple signal channels down to a stereo output field to match the standardized configuration found in many home stereo systems. Figure 25 is a representation of what a typical panning control 'pod' looks like on a mixing console or 8-track home recording device, along with some typical pan settings.

(Figure: three typical pan-control settings of a mono source to a stereo output field: full left pan, center mix, full right pan.)

"Source of the Sound"

To give the listener a sense of location within the output stereo field, the DSP can simply multiply the algorithmic result on both the left and right channels so that it is perceived as coming from a phantom source.
(Figure: virtual source at angle phi between two speakers at +/- phi_0; panning of two-channel stereophonic audio, derived by Blumlein, Bauer and Bernfeld [26].)

(Figure: Pulkki's method [26] for vector panning of two-channel audio; the virtual source position p is the combination of the speaker vectors gL*lL and gR*lR, with channel outputs yL(n) = gL*xL(n) and yR(n) = gR*xR(n).)

Source of the Sound

To create a panning effect of an audio channel to a particular position in the stereo output field, the programmer can use the Stereophonic Law of Sines or the Tangent Law equation (Pulkki, Blumlein and Bauer [22], see Figure), where gL and gR are the respective gains of the left and right channels:

    sin(phi) / sin(phi_0) = (gL - gR) / (gL + gR)

    where 0 < phi_0 < 90 degrees, -phi_0 < phi < phi_0, and gL, gR in [0, 1].

This is valid if the listener's head is pointing straight ahead. If the listener turns the head to follow the virtual source, the Tangent Law equation as described by Pulkki [derived by Bernfeld, 26] is used instead:

    tan(phi) / tan(phi_0) = (gL - gR) / (gL + gR)

    with the same constraints on phi, phi_0, gL, and gR.

Assuming fixed-point signed fractional arithmetic where signals are represented between 0 (0x0000...) and 0.99999... (0x7FFF...), the DSP programmer simply needs to multiply each signal by the calculated gain.
Using Pulkki's Vector Base Amplitude Panning method, the position p of the phantom sound source is calculated from the linear combination of both speaker vectors:

    p = gL*lL + gR*lR

The output I/O equations for each channel are simply:

    yL(n) = gL * xL(n)        yR(n) = gR * xR(n)

Vector Based Amplitude Panning Summary

Left Pan: if the virtual source is panned completely to the left channel, the signal only comes out of the left channel and the right channel is zero. When the gain is 1, the signal is simply passed through to the output channel: gL = 1, gR = 0.

Right Pan: if the virtual source is panned completely to the right channel, the signal only comes out of the right channel and the left channel is zero. When the gain is 1, the signal is simply passed through to the output channel: gL = 0, gR = 1.

Center Pan: if the phantom source is panned to the center, the gains in both speakers are equal: gL = gR.

Arbitrary Virtual Positioning: if the phantom source is between both speakers, the tangent law applies. The resulting stereo mix perceived by the listener is offset left/right from the center of both speakers. Some useful design equations [26]:

    gL  = ( cos(phi)*sin(phi_0) + sin(phi)*cos(phi_0) ) / ( 2*sin(phi_0)*cos(phi_0) )
    gR  = ( cos(phi)*sin(phi_0) - sin(phi)*cos(phi_0) ) / ( 2*sin(phi_0)*cos(phi_0) )
    phi = arctan( ((gL - gR)*sin(phi_0)) / ((gL + gR)*cos(phi_0)) )

Table Lookup

If the DSP processor/IDE does not support trigonometric functions, a lookup table of precomputed panning values can be stored for a number of angles, giving the left- and right-channel gains required for each desired panning angle.

Graphic Equalizers

Professionals and consumers use equalizers to adjust the amplitude of a signal within selected frequency ranges.
In a graphic equalizer, the frequency spectrum is broken up into several bands using band-pass filters. Setting the different gain sliders to a desired setting gives a 'visual graph' (figure in the next slide) of the overall frequency response of the equalizer unit. The more bands in the implementation, the more accurately the desired response can be matched.

Analog equalizers typically use passive and active components, and increasing the number of bands results in a large board design. When implementing the same system on a DSP, however, the number of bands is limited only by the speed of the DSP (MIPS), while board space remains the same. Resistors and capacitors are replaced by discrete-time filter coefficients, which are stored in memory and can be easily modified. Figure 33 in the next slide shows an example DSP structure for implementing a 6-band graphic equalizer using second-order IIR filters. The feedforward path has a fixed gain of 0.25, while each filter band can be multiplied by a variable gain for boost/attenuation. There are many implementation methods for the second-order filter, such as ladder structures or biquad filters.
Filter coefficients can be generated by a commercially available filter design package, where the A and B coefficients are generated for the following second-order transfer function and equivalent I/O difference equations:

    H(z) = (B0 + B1*z^-1 + B2*z^-2) / (1 - A1*z^-1 - A2*z^-2)

The direct form II implementation equations are:

    d[m] = x[m] + A1*d[m-1] + A2*d[m-2]
    y[m] = B0*d[m] + B1*d[m-1] + B2*d[m-2]

and the equivalent direct (ARMA) implementation equation is:

    y[m] = B0*x[m] + B1*x[m-1] + B2*x[m-2] + A1*y[m-1] + A2*y[m-2]

Block Diagram of Graphic Equalizer

(Figure: DSP implementation of a digital graphic equalizer. The input x[n] feeds N parallel second-order IIR band filters, Band 1 through Band N; each band output is scaled by its own gain g1 ... gN, the results are summed, and a master gain gMaster produces y[n].)

Time-Delay Digital Audio Effects

Background theory and basic implementation of a variety of time-based digital audio effects will be examined. The figure below shows some algorithms that can be found in a digital audio effects processor. Multiple-reflection delay effects using delay lines will be discussed first, then more intricate effects such as:

Chorusing: animation of the basic sound by mixing it with two slightly detuned copies of itself.

Flanging: an audio process that combines two copies of the same signal, with the second delayed slightly (less than 20 msec), to produce a swirling effect.

Pitch shifting.

Reverberation: the persistence of sound in a particular space after the original sound is removed. When sound is produced in a space, a large number of echoes build up and then slowly decay as the sound is absorbed by the walls and air, creating reverberation, or reverb.
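A second-order direct form II section matching the sign convention above (A coefficients added in the feedback path) can be sketched in C; the struct layout and function name are illustrative assumptions:

```c
/* One second-order IIR (biquad) band filter, direct form II:
 *   d[m] = x[m] + A1*d[m-1] + A2*d[m-2]
 *   y[m] = B0*d[m] + B1*d[m-1] + B2*d[m-2]
 * for H(z) = (B0 + B1 z^-1 + B2 z^-2) / (1 - A1 z^-1 - A2 z^-2). */
typedef struct {
    float b0, b1, b2;  /* feedforward coefficients */
    float a1, a2;      /* feedback coefficients    */
    float d1, d2;      /* state: d[m-1] and d[m-2] */
} biquad;

float biquad_process(biquad *f, float x)
{
    float d = x + f->a1 * f->d1 + f->a2 * f->d2;   /* feedback  */
    float y = f->b0 * d + f->b1 * f->d1 + f->b2 * f->d2; /* feedforward */
    f->d2 = f->d1;     /* shift the two-sample state */
    f->d1 = d;
    return y;
}
```

A graphic-equalizer band then scales `biquad_process` output by its slider gain gi before the summation; only two state variables per band are needed, which is the advantage of direct form II over the direct (ARMA) form.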
Typical Signal Chain for Audio Multi-Effects Processors

(Figure: input signal -> compressor -> distortion/overdrive -> chorus/flanger -> digital delay/reverb -> equalizer -> output signal, with a bypass control around the chain.)

Digital Delay

The digital delay is the simplest of all time-delay audio effects. The delay effect is often the basis for more intricate effects such as flanging and chorusing, which vary the delay time on the fly. It is also used in reverberation algorithms to produce early reflections and recursive delays. To create a reflection digitally, DSP delay effects units encode the input signal and store it digitally in a delay-line buffer until it is required at a later time, when it is decoded back to analog form [17]. The DSP can produce delays in a variety of ways: delay units can produce stereo results and multiple-tapped delayed results [7]. Many effects processors implement a delay and use it as a basis for producing multitap and reverb effects. Multitapped signals can be panned to the right, left, or mixed together to give the listener the impression of a stereo echo bouncing from one side to the other.

Digital Delay Line

A delay line with buffer size D implements

    y[n] = x[n - D]        (transfer function z^-D)

    delay (sec) = D_buffer_size * T_sample = D_buffer_size / f_sample_rate

Generation of Signals

Signals for wavetable synthesis, delay-line modulation, and tremolo effects can be produced by:

- periodic lookup of a signal stored in the DSP's data memory, or
- generating the signal in real time.

Wavetable generators can be used to implement many time-delay modulation effects such as the chorus, flanger, and vibrato. (Vibrato is a musical effect where the pitch or frequency of a note or sound is quickly and repeatedly raised and lowered over a small distance for the duration of that note or sound.
Vibrato is naturally present in the human voice, and is used to add expression and vocal-like qualities to instrumental notes.) They can also implement amplitude effects such as the tremolo (the rapid repetition of one note in music, or a rapid alternation between two or more notes). The figure in the next slide shows some of the more common signals that can easily be stored in memory for use in audio applications.

Example of Signals

(Figure: common wavetable signals.)

How to Generate Pure Tone Signals

A digital sinusoidal signal is

    x[n] = A * sin(2*pi*(f0/fs)*n) = A * sin(2*pi*F0*n) = A * sin(omega_0*n)

    A  - amplitude
    f0 - frequency of the sinusoid in Hz
    F0 - normalized frequency in cycles per sample, defined as
         F0 = f0/fs,  with -1/2 <= F0 <= 1/2

Generation of Basic Signals

The range of the normalized frequency is -1/2 <= F0 <= 1/2, since 2|f0| <= fs according to the sampling theorem. The digital frequency omega_0, in radians per sample, is defined as:

    omega_0 = 2*pi*F0,    -pi <= omega_0 <= pi

VisualDSP++ Library Functions

(Table: signal-generation library functions.)

Notes/Pitch Frequency

A brief overview of notes and their frequencies. Technically, music can be composed of notes at any arbitrary frequency. Since the physical causes of music are vibrations of mechanical systems, they are often measured in hertz (Hz), with 1 Hz = 1 complete vibration cycle per second. For historical and other reasons, especially in Western music, only twelve notes of fixed frequencies are used. These fixed frequencies are mathematically related to each other, and are defined around the central note A4. The current "standard pitch" or "concert pitch" for this note is 440 Hz: A440 is the musical note A above middle C (A4), and this 440 Hz tone serves as the standard for musical pitch.

The note naming convention specifies a letter, any sharp/flat, and an octave number.
The note letters cycle C, D, E, F, G, A, B, C. Any note is exactly an integer number of half-steps away from central A (A4). Let this distance be denoted n. Then:

    frequency = 440 * 2^(n/12) Hz

For example, to find the frequency of the C above middle A (C5): there are +3 half-steps between A4 and C5 (A -- (1) A# -- (2) B -- (3) C), so

    frequency = 440 * 2^(3/12) Hz ~= 523.2511 Hz

It is important to keep the sign of n in mind. For example, the F below middle A is F4. There are -4 half-steps (A -- (1) Ab -- (2) G -- (3) Gb -- (4) F, each step descending the scale), so:

    frequency = 440 * 2^(-4/12) Hz ~= 349.2290 Hz

(Ab is called A-flat; A# is called A-sharp.)

Octaves and Notes

Finally, it can be seen from this formula that octaves (n = 12) automatically yield factors of two times the original frequency (in fact, this is the means to derive the formula, combined with the notion of equally spaced intervals). For use with the MIDI (Musical Instrument Digital Interface) standard, a frequency mapping is defined by:

    p = 69 + 12 * log2(f / 440)

For notes in an A440 equal temperament, this formula delivers the standard MIDI note number. Any other frequencies fill the space between the whole numbers evenly. This allows MIDI instruments to be tuned very accurately in any microtuning scale, including non-Western traditional tunings.

Implementation Approaches

Most high-level languages such as C/C++ have built-in support for generating trigonometric functions. Real-time embedded-system software engineers who program DSP algorithms mostly in assembly do not have the flexibility of a high-level language when generating signals. Various methods proposed by Crenshaw [8], Orfanidis [2] and Chrysafis [39] can be used for generating sinusoidal/random signals in a DSP. Signal generation can be achieved by:

1.
Making a subroutine/function call to a Taylor-series function approximation for trigonometric signals, or to a uniform/Gaussian random-number generator routine for random white-noise generation.
2. Using a table lookup.
3. Using hold/linear interpolation operations between consecutive locations in the wavetable to increase the resolution of the stored signal.

Implementation Approaches

The advantages of using a wavetable to generate a signal:

- It is trivial to generate the signal simply by performing a memory read from the buffer, saving DSP cycle overhead.
- The wavetable can be implemented as a circular buffer so that the stored signal is regenerated over and over.
- The larger the buffer, the purer the signal that can be generated. With the larger internal memory sizes integrated on many DSPs, or the use of low-cost commodity SDRAM, the option of using a lookup table is more easily achievable than in the past.

To save memory storage, the size of the table can be reduced by a factor of 2 and, as suggested above, the DSP can interpolate between 2 consecutive values. For example, a wavetable buffer can contain 4000 locations to represent one period of a sine wave, and the DSP can interpolate between every pair of values to produce 8000 elements for constructing the signal. This is not a bad approximation for generating a decent-sounding tone.

What is the best way to progress through the table? The general recommendation is to declare the wavetable in the DSP program as a circular buffer instead of a linear buffer (see the examples in the figure in the next slide). This allows the signal to be replayed over and over without the program having to check whether the pointer needs to be reset.

Two methods can be used to progress through the lookup table:
1. Sample-Rate Dependent Update: one method for updating a wavetable pointer is a sample-rate-dependent update, where a new lookup value is generated every time the sample-processing algorithm is entered (typically via an interrupt service routine). This synchronization with the sample rate avoids introducing possible aliasing artifacts when implementing delay-line modulation.

2. DSP Timer Expire Update: another method is to update the value in the table using the DSP's on-chip programmable timer. Every time the timer expires and resets itself, the timer ISR can update the pointer to the wavetable buffer. This method allows movement through a table at a rate that is not tied to the converter's sampling rate, allowing more flexible and precise timing of signal generation or delay-line modulation.

For certain digital audio effects such as flanging/chorusing/pitch shifting, lookup-table updates can easily be achieved using the programmable timer as well as via the audio-processing ISR. The delay-line modulation value can easily be updated by using the programmable timer, or an interrupt counter, to process the parameter that determines how far back in the delay-line buffer the DSP's data addressing unit needs to fetch a previously stored sample.

A sine wavetable can be used to implement many time-delay modulation effects and amplitude effects such as the chorus, flanger, vibrato, and tremolo. Random low-frequency oscillator (LFO) tables can be used to implement realistic chorus effects [2]. A sawtooth wavetable is useful for shifting the pitch of a signal [16]. We will look at these examples in more detail in subsequent sections.
Digital Delay: Single Reflection Delay

(Figure: implementation of a digital delay with a single tap; the impulse response h[n] has a sample of height 1 at n = 0 and a sample of height a at n = D.)

To implement a single reflection of an input signal, the following difference equation can be used:

    y[n] = x[n] + a*x[n - D]

Automatic Double Tracking (ADT) and Slapback Echo

One popular use of the digital delay is to quickly repeat the input signal with a single reflection at unity gain. By making the delay of the input signal around 15-40 milliseconds, the resulting output produces a "slapback" or "doubling" effect (see the figure in the previous slide). The slight difference in delay creates the effect of two parts being played in unison. This effect can also be set up to play back the original "dry" signal in one stereo channel and the delayed signal in the other channel (figure in the next slide), creating the impression of a stereo effect from a single mono source. The same technique is used for a mono result, except both signals are added together. With short delays, slapback can "thicken" the sound of an instrument or voice when mixed for a mono result, although cancellations can occur from comb-filtering side effects when the delay is under 10 ms, which results in a hollow, resonant sound [2], [26].

(Figure: slapback echo effect and automatic double tracking / "stereo doubling". The input x[n] and a copy delayed by 10 to 30 msec are scaled, by 0.5 each for the mono slapback mix, or by gL and gR for the stereo doubling outputs yL[n] and yR[n].)

Multitap Delays

Multiple delayed values of an input signal can be combined easily to produce multiple reflections of the input. This can be done by having multiple taps pointing to different previous inputs stored in the delay line, or by having separate memory buffers of different sizes where input samples are stored.
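The single-reflection difference equation y[n] = x[n] + a*x[n-D] can be sketched with a circular delay-line buffer in C; the buffer length and function name are illustrative (a slapback delay at fs = 48 kHz would need D on the order of 700-1900 samples):

```c
#define D 8             /* delay in samples (illustrative) */

static float dline[D];  /* circular delay-line buffer */
static int   wr;        /* write index                */

/* y[n] = x[n] + a*x[n-D]: read the oldest stored sample,
 * mix it with the input, then overwrite it with x[n]. */
float echo(float x, float a)
{
    float delayed = dline[wr];   /* oldest entry holds x[n-D] */
    float y = x + a * delayed;
    dline[wr] = x;               /* store x[n] for later      */
    wr = (wr + 1) % D;
    return y;
}
```

Because the read and write share one index, the buffer needs exactly D entries and no pointer-reset test, which mirrors the circular-buffer recommendation above.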
(Figure: typical impulse response of a multiple-delay effect, with a sample of height 1 at n = 0 and taps of decreasing amplitude a1 ... a5 at delays D, 2D, 3D, 4D, 5D.)

Multitap Delay Difference Equation

The difference equation is a simple modification of the single-delay case. With M delays of the input (see figure), the DSP processing algorithm performs the following difference equation operation:

    y[n] = x[n] + a1*x[n - D] + a2*x[n - 2D] + ... + aM*x[n - M*D]

Reverberation Effect

Adding an infinite number of delays creates a rudimentary reverb effect by simulating reflections in a room. The difference equation then becomes an IIR comb filter:

    y[n] = sum_{i=0..inf} a^i * x[n - i*D]
         = x[n] + sum_{i=1..inf} a^i * x[n - i*D]
         = x[n] + a*y[n - D]

IIR Implementation of Reverberation Effect

(Figure: x[n] plus a times the output of a z^-D block fed from y[n].)

    y[n] = x[n] + a*y[n - D]

2-Tap Multidelay Effect Implementation

A 2-tap multidelay, described by Orfanidis [Introduction to Signal Processing], uses two delay lines D1 and D2, each with its own feedback gain, and mixes their taps with the input:

    y1[n] = x[n] + a1*y1[n - D1]
    y2[n] = y1[n - D1] + a2*y2[n - D2]
    y[n]  = b0*x[n] + b1*y1[n - D1] + b2*y2[n - D2]

Multitap 'Ping-Pong' Stereo Delay

(Figure: two cross-coupled delay lines D1 and D2 feeding the left and right outputs yL[n] and yR[n].) Exercise: write the difference equations describing the system.

Delay Modulation Effects

Delay modulation effects are among the more interesting audio effects, and they are computationally more complex. The technique used is often called delay-line interpolation, where the delay-line center tap is modified, usually by some low-frequency waveform. Interpolating/decimating samples within the delay line results in a slight pitch change of the input signal.
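The IIR comb filter y[n] = x[n] + a*y[n-D] can be sketched the same way as the echo, except the delay line stores past outputs (feedback) rather than past inputs; the tiny buffer length is illustrative, since a real room simulation uses delays of many hundreds of samples:

```c
#define RD 4            /* loop delay in samples (illustrative) */

static float rline[RD]; /* circular buffer of past outputs y[n-RD] */
static int   rwr;

/* y[n] = x[n] + a*y[n-RD].  Keeping |a| < 1 makes the loop
 * stable and sets how slowly the repeating echoes decay. */
float comb_reverb(float x, float a)
{
    float y = x + a * rline[rwr];
    rline[rwr] = y;              /* feed the output back in */
    rwr = (rwr + 1) % RD;
    return y;
}
```

A single impulse in produces the geometric echo train 1, a, a^2, ... spaced RD samples apart, which is exactly the infinite-sum impulse response above.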
Thus, one type of pitch-shift algorithm falls under this category, although there are other DSP methods for pitch shifting. The effects listed below fall under delay-line modulation:

- Chorus: simulation of multiple instruments/voices.
- Flanger: "swooshing jet" sound.
- Doppler: pitch increase/decrease of an object moving toward/away from the listener.
- Pitch shifting: changing the frequency of an input source.
- Doubling: adding a small delay/pitch change to an input source.
- Leslie rotating-speaker emulation: a combination of vibrato and tremolo.

General Structure of Delay-Line Modulation

    e[n] = x[n] - f*e[n - D_fixed]       (feedback gain f around the fixed center tap)
    y[n] = a1*e[n] + a2*e[n - d(n)]      (d(n) modulates the center of the delay line)

Visualization of Circular Buffer Pointer Position

(Figure: rotating the center tap of the circular buffer between D and D/2 produces a pitch drop or a pitch increase, depending on the direction of rotation.)

Delay-Line Modulation

The general structure above (figure in the previous slide), described by J. Dattorro [6], allows the creation of many different types of delay modulation effects. Each input sample is stored in the delay line, while the moving output tap is retrieved from a different location in the buffer, rotating around the tap center (figure in the previous slide). When the small delay variations are mixed with the direct sound, a time-varying comb filter results [2], [6].
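Reading the moving tap at a fractional, time-varying delay d(n) is the core of delay-line interpolation. A hedged C sketch (names and buffer length are assumptions) using linear interpolation between the two neighboring buffer entries:

```c
#define DLEN 1024              /* delay-line length */

static float dl[DLEN];
static int   w;                /* write index */

/* Store x[n], then read the tap d samples back, where d may be
 * fractional; linear interpolation between the two nearest
 * stored samples approximates the fractional delay. */
float modulated_tap(float x, float d)
{
    dl[w] = x;
    int   di   = (int)d;             /* integer part    */
    float frac = d - di;             /* fractional part */
    int   r0   = (w - di     + DLEN) % DLEN;
    int   r1   = (w - di - 1 + DLEN) % DLEN;
    float y = (1.0f - frac) * dl[r0] + frac * dl[r1];
    w = (w + 1) % DLEN;
    return y;
}
```

Sweeping `d` slowly up or down between calls reproduces the rotating center tap of the figure: an increasing d(n) lowers the perceived pitch, a decreasing d(n) raises it.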
If the delay of an input signal is very small (around 10 msec), the echo mixed with the direct sound causes certain frequencies to be enhanced or canceled (due to the comb-filtering effect). This causes the output frequency response to change. By varying the amount of delay time when mixing the direct and delayed signals together, variable delay lines create some striking sound effects such as chorusing and flanging.

Flanger Effect

Historical background: "flanging" was coined from the way the effect was accidentally discovered. As legend has it, a recording engineer was recording a signal onto two reel-to-reel tape decks and monitoring both playback heads at the same time. While trying to simulate the ADT or doubling effect, it was discovered that small changes in the tape speed between the two decks created a 'swooshing' jet sound. This effect was further enhanced by repeatedly leaning on the flange of one of the tape reels to slow the tape down slightly. Thus the flanger was born.

Dictionary.com: flange, n. A protruding rim, edge, rib, or collar, as on a wheel or a pipe shaft, used to strengthen an object, hold it in place, or attach it to another object. Webster.com: flange, noun; etymology: perhaps an alteration of flanch, a curving charge on a heraldic shield. 1: a rib or rim for strength, for guiding, or for attachment to another object (a flange on a pipe, a flange on a wheel); 2: a projecting edge of cloth used for decoration on clothing (a jacket with flange shoulders).

Flanger Effect

It is very easy to recreate this effect using a DSP.
Flanging can be implemented in a DSP by delaying the input signal with a small, variable time delay between 0.25 and 25 milliseconds, varied at a very low frequency, and adding the delayed replica to the original input signal (figure in the next slide). When the time-delay offset is varied by rotating the delay-line center tap, the in-phase and out-of-phase frequencies resulting from the comb filtering sweep up and down the frequency spectrum (figure two slides ahead). The "swooshing" jet-engine effect created as a result is referred to as flanging.

General Structure of Delay-Line Modulation for the Flanger

(Figure: a sine generator modulates the tap center of the delay line.)

    y[n] = a1*x[n] + a2*x[n - d(n)]

Frequency Response of the Flanger

(Figure: comb-filter frequency response.)

Implementation of the Varying Delay d(n)

Flanging is created by periodically varying the delay d(n). The variation of the delay time (or delay-buffer size) can easily be controlled in the DSP using a low-frequency-oscillator sine-wave lookup table that calculates the variation of the delay time, with the update of the delay determined on a per-sample basis or by the DSP's on-chip timer. To sinusoidally vary the delay between 0 < d(n) < D, the on-chip timer interrupt service routine should calculate the following equation described by Orfanidis [2]:

    d(n) = (D/2) * (1 - cos(2*pi*n*f_cycle))

Chorus Effect

Chorusing is used to "thicken" sounds. The time-delay algorithm (typically between 15 and 35 milliseconds) is designed to duplicate the effect that occurs when many musicians play the same instrument and the same music part simultaneously. Musicians are usually synchronized with one another, but there are always slight differences in timing, volume, and pitch.
These differences are caused by non-identical instruments and variations in playing style. The chorus effect can be recreated digitally with a variable delay line rotating around the tap center, adding the time-varying delayed result to the input signal. Examples: a 6-string guitar can be "chorused" to sound more like a 12-string guitar; vocals can be "thickened" to sound like more than one singer.

Chorus Effect

Note: the chorus algorithm is similar to flanging and uses the same difference equation, except that the delay time is longer. With a longer delay line, the comb filtering is brought down to the fundamental frequency and lower-order harmonics (see the figure in the next slide). The next two block diagrams represent the structure of the chorus effect simulating 2 and 3 instruments.

Implementation of a Chorus Effect Simulating 2 Instruments

(Figure: a random low-frequency oscillator (LFO) modulates the center of the delay line.)

    y[n] = a1*x[n] + a2*x[n - d(n)]

Implementation of a Chorus Effect Simulating 3 Instruments

(Figure: two random LFOs modulate two delay lines.)

    y[n] = a0*x[n] + a1*x[n - d1(n)] + a2*x[n - d2(n)]

Variation of the 3-Instrument Chorus Effect

One variation of the 3-instrument chorus effect, described by Orfanidis [2], is:

    y[n] = a0*x[n] + a1(n)*x[n - d1(n)] + a2(n)*x[n - d2(n)]

where a1(n) and a2(n) can be low-frequency random numbers (variable gains) with unity mean. The small variations in the delay time can be introduced by a random LFO at around 3 Hz. A low-frequency random LFO lookup table can be used to recreate the random variations of the musicians, although the circular buffer will still be periodic.
A low-frequency random LFO lookup table can be used to recreate the random variations of the musicians, although the circular buffer will still be periodic.

Chorus: Result of Adding a Variable-Delayed Signal to its Original
[Figure]

Varying Delay Implementation
The varying delay d(n) is updated using the following equations:
d[n] = D*(0.5 + LFO[n])
d[n] = D*(0.5 + v[n])
The signal v[n] is described by Orfanidis [2] as a zero-mean low-frequency random signal varying between [-0.5, 0.5].

Chorus Delay Parameters
[Figure]

Chorus Parameters
Like the flanger, most units offer "Modulation" (or Rate) and "Depth" controls.
Sweep Depth: Depth (or Delay) controls the length of the delay line, allowing a user to change the length on the fly. It determines how much the time offset changes during an LFO cycle, and it is combined with the delay-line value to give the total delay used to process the signal.
Modulation: the variations in delay time are introduced by a low-frequency oscillator (LFO), whose frequency can usually be controlled with the "Sweep Rate" parameter. Usually the LFO consists of a low-frequency random signal. When the waveform is at its largest value, the resulting variable delay is the maximum delay possible. An increasing slope in the LFO causes the pitch to drop; a negative slope results in a pitch increase.

Examples: sine and triangle waves can be used to vary the delay time. One easy method for generating the modulation value is through a wavetable lookup. The value in the table can be modified on a sample basis by the chorus routine, or the lookup can be driven by the DSP's on-chip programmable timer: when the timer count expires and the DSP vectors off to the timer interrupt service routine, the modulation value is updated with the next value in the waveform buffer. The LFO can be repeated continuously by making the wavetable a circular buffer.

Using a cosine wavetable, the varying delay d(n) is updated using the following equation:
d[n] = D*(0.5 + LFO[2*pi*n*f_Delay])

Parameters of the Delay-Line Equation
D: delay-line length.
f_Delay: frequency of the LFO, whose wavetable spans one period of 2*pi.
n: the nth location in the wavetable lookup.

Chorus Parameters
The small variations in the time delays and amplitudes can also be simulated by varying them randomly at a very low frequency, around 3 Hz:
d[n] = D*(0.5 + v[n])   or   d[n] = D*(0.5 + random_LFO_number[n])
where v[n] is the current variable delay value from the random LFO generator.

Flanging/Chorusing Similarities & Differences
Both flanging and chorusing use variable buffers to change the time delay on the fly, and both achieve these variations in delay time by using a low-frequency oscillator (LFO). This parameter is available on commercial units as the "Sweep Rate". The "Sweep Depth" parameter determines the amount of delay in the sweep period: the greater the depth, the farther apart the peaks and dips of the phase cancellation. The key difference between the two effects is that the flanger found in many commercial units changes the delay using a low-frequency sine-wave generator, whereas the chorus usually changes the delay using a low-frequency random-noise generator.

Vibrato
The vibrato effect duplicates the 'vibrato' in a singer's voice while sustaining a note, a musician bending a stringed instrument, or a guitarist using the guitar's 'whammy' bar. This effect is achieved by evenly modulating the pitch of the signal. The sound that is produced can vary from a slight enhancement to a more extreme variation.
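The chorus delay update d[n] = D*(0.5 + v[n]) described above can be sketched as follows. The slides specify only that v[n] is a zero-mean, low-frequency random signal in [-0.5, 0.5]; the one-pole smoothing of a uniform random number below is an illustrative stand-in for the random-LFO wavetable, not the slides' implementation:

```c
#include <assert.h>
#include <stdlib.h>

/* Chorus delay update, d[n] = D*(0.5 + v[n]), where v[n] is a zero-mean
 * low-frequency random signal in [-0.5, 0.5] (Orfanidis [2]).  Here v[n]
 * is a slowly varying, one-pole-smoothed uniform random number. */
static float v_state = 0.0f;

int chorus_delay(int D)
{
    float r = (float)rand() / (float)RAND_MAX - 0.5f; /* uniform in [-0.5, 0.5] */
    v_state = 0.99f * v_state + 0.01f * r;            /* slow "random LFO"      */
    return (int)(D * (0.5f + v_state));
}
```

Because v_state is always a convex combination of values in [-0.5, 0.5], the returned delay stays inside [0, D], as the slide's equation requires.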
It is similar to a guitarist moving the 'whammy' bar, or a violinist creating vibrato with a cyclical movement of the playing hand. Some effects units offer vibrato as well as tremolo; however, the effect is more often found on chorus effects units.

Vibrato
The slight change in pitch can be achieved (with a modified version of the chorus effect) by varying the depth with enough modulation to produce a pitch oscillation. This is accomplished by changing the modify value of the delay-line pointer on the fly, with the value chosen from a lookup table. This results in interpolation/decimation of the stored samples via rotation of the delay line's center tap: the stored 'history' of samples is played back at a slower or faster rate, causing a slight change in pitch. To obtain an even variation in the pitch modulation, the delay line is modified using a sine wavetable. This effect is often confused with 'tremolo', in which the amplitude is varied by an LFO waveform. Tremolo and vibrato can be combined with a time-varying LPF to produce the effect of a rotating speaker (commonly referred to as a 'Leslie' rotating-speaker emulation). Note that this is a stripped-down version of the chorus effect, in that the direct signal is not mixed with the delay-line output.

Implementation of the Vibrato Effect
[Figure: a sine wavetable modulates the center tap of the delay line; only the delayed tap is output:]
y[n] = a0*x[n - d(n)]

Pitch Shifter
An interesting and commonly used effect is changing the pitch of an instrument or voice. The algorithm that can be used to implement a pitch shifter is the chorus or vibrato effect. The chorus routine is used if the user wishes to include the original and pitch-shifted signals together.
The vibrato routine can be used if the desired result is only the pitch-shifted samples, which is often used by TV interviewers to make an anonymous person's voice unrecognizable. The only difference from these other effects is the waveform used for delay-line modulation: the pitch shifter requires a sawtooth/ramp wavetable to achieve a 'linear' process of dropping and adding samples in playback from an input buffer. The slope of the sawtooth wave, as well as the delay-line size, determines the amount of pitch shifting performed on the input signal.

Implementation of the Generic Pitch Shifter
[Figure: a sawtooth wavetable modulates the center tap of the delay line; the delayed tap is scaled and passed through a low-pass filter to produce y[n].]

Pitch Shifter
Click Effects: the audible side effect of using the 2-instrument chorus algorithm (with one delay line) is the 'clicks' produced whenever the delay pointer passes the input-signal pointer as samples are added or dropped. This happens because the output pointer moves through the buffer at a faster/slower rate than the input pointer, eventually causing an overlap. To reduce or eliminate this undesired artifact, cross-fading techniques can be used between two alternating delay-line buffers with a windowing function, so that when one of the delay-line output pointers is close to the input, a zero crossing occurs at the overlap to avoid the 'pop' that is produced.
Warble Effect: for larger pitch-shift values, there is a noticeable 'warble' audio modulation produced as a result of the outputs of the delay lines being out of phase, which causes periodic cancellation of frequencies.
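A minimal sketch of the sawtooth-modulated delay at the heart of the pitch shifter is shown below (single delay line, no cross-fade, so the pointer wrap produces the 'click' artifact described above; buffer size and the fractional-delay handling are illustrative assumptions):

```c
#include <assert.h>

#define D 512                 /* delay-line length (illustrative)            */
static float buf[D];
static int   w = 0;           /* write (input) pointer                       */
static float rdelay = 0.0f;   /* current delay in samples, sawtooth-ramped   */

/* One pitch-shifted sample.  The delay ramps by (1 - shift) per sample, so
 * for shift > 1 the read tap advances faster than the write pointer and the
 * stored samples play back faster (pitch up); shift < 1 lowers the pitch.
 * The sawtooth wrap drops or repeats a buffer's worth of samples. */
float pitch_shift(float x, float shift)
{
    buf[w] = x;
    int d = (int)rdelay;                     /* truncate fractional delay   */
    float y = buf[(w - d + D) % D];
    rdelay += (1.0f - shift);                /* sawtooth delay modulation   */
    if (rdelay >= (float)D) rdelay -= (float)D;  /* wrap: drop samples      */
    if (rdelay < 0.0f)      rdelay += (float)D;  /* wrap: repeat samples    */
    w = (w + 1) % D;
    return y;
}
```

With shift = 1.0 the delay stays at zero and the routine is a pass-through; a production version would interpolate the fractional read position and cross-fade two such lines, as the slides describe.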
Example: Two-Voice Pitch Shifter Implementation
[Figure: two sawtooth-wave generators modulate the center taps of two delay lines; their outputs are combined through a cross-fade function to produce y[n].]

Detune Effects
The Detune Effect is actually a version of the pitch shifter: the pitch-shifted result is set to vary from the input by about +/-1% of the input frequency. This is done by setting the pitch-shift factor to 0.99 or 1.01. The effect's result is to raise or lower the output and combine the pitch shift with the input so that it varies by a few Hz, resulting in an 'out of tune' effect. (The algorithm actually uses a version of the chorus effect with a sawtooth to modulate the delay line.) Small pitch-scaling values produce a 'chorus-like' effect and imitate two instruments slightly out of tune. This effect is useful on vocal tracks to give the impression of two musicians singing the same part using one person's voice. The pitch shift is too small for the formant frequencies of the vocal track to be affected, so the shifted voice still sounds realistic.
For a strong Detune Effect, vary the pitch by 5-10 Hz.
For a weak Detune Effect (a 'Sawtooth Chorus' sound), vary the pitch by 2-3 Hz.

Digital Reverberation: Algorithms for Simulation of Large Acoustic Spaces
Reverberation is another time-based effect, requiring more complex processing than echoing, chorusing, or flanging. Reverberation is often mistaken for delay or echo effects; most multi-effects processing units provide a variation of both. The first simulated-reverb units in the '60s and '70s used a mechanical spring or plate attached to a transducer, passing the electrical signal through it.
Another transducer at the other end converted the mechanical reflections back into an electrical output. However, this did not produce realistic reverberation. M.R. Schroeder and James A. Moorer developed algorithms for producing realistic reverb using a DSP.

Reverberation of Large Acoustic Spaces
[Figure]
The reverb effect simulates the effect of sound reflections in a large concert hall or room (figure in the previous slide). Instead of a few discrete repetitions of a sound, as in a multi-tap delay effect, the reverb effect implements many delayed repetitions so close together in time that the ear cannot distinguish the differences between the delays; the repetitions blend together to sound continuous. The sound radiates in every direction from the source, bounces off the walls and ceiling, and returns from many angles with different delays. Reverberation is almost always present in indoor environments, and the reflections are stronger for hard surfaces. As the figure in the next slide shows, reverberated sound is classified into three components: the direct sound, the early reflections, and the closely blended echoes (reverberation) [11,12,14].

Impulse Response for Large-Auditorium Reverberation
[Figure]
Direct sound: reaches the listener directly from the sound source.
Early reflections: early echoes arriving within 10 ms to 100 ms after the direct sound, produced by the first reflections off surfaces.
Closely blended echoes: produced after the 100 ms of early reflections.

Large-Auditorium Reverberation
The figure in the previous slide shows an impulse response of a large acoustic space, such as an auditorium or gymnasium. Early Reflections: in a typical large auditorium, the first distinct delayed responses that the listener hears are termed 'early reflections'.
These early reflections are a few relatively close echoes, which occur as part of the reverberation in large spaces. The early reflections are the result of the first bounce of the source sound off nearby surfaces. Echoes: next come echoes that follow one another at such small intervals that the later reflections are no longer distinguishable to the human ear. A digital reverb typically processes the input through multiple delayed filters and adds the result to the early-reflection computations. Parameters to consider in the algorithm are the decay time (the time it takes for the reverb to decay), the presence (dry signal output vs. reverberation), and the tone control (bass or treble) of the output reverberation.

Digital Reverberation
M.R. Schroeder suggested two ways of producing a more realistic-sounding reverb:
Approach I: implement 5 all-pass filters cascaded together.
Approach II: use 4 comb filters in parallel, sum their outputs, then pass the result through 2 all-pass filters in cascade.

Digital Reverberation
James A. Moorer expanded on Schroeder's research. One drawback of the Schroeder reverb is that the high frequencies tend to reverberate longer than the lower frequencies. Moorer proposed using a low-pass comb filter in each reverb stage to increase the density of the response. He demonstrated a technique involving 6 parallel comb filters with low-pass outputs, summing their outputs and then sending the summed result through an all-pass filter before producing the final result. Moorer also recommended simulating the early reflections common in concert halls with a tapped-delay FIR filter structure, along with the reverb filters, for a more realistic response. Some initial delays can be added to the input signal by using an FIR filter ranging from 0 to 80 milliseconds.
Moorer chose appropriate filter coefficients to produce 19 early reflections. Moorer's reverberator produced a more realistic reverb sound than Schroeder's, but it still produces a rough sound for impulsive signals such as drums.

James A. Moorer's Digital Reverberation Structure
[Figure]

Reverb Building Blocks: Low-Pass Comb-Filter and All-Pass Filter Structures
For realistic-sounding reverberation, the DSP requires large delay lines for both the comb-filter and early-reflection buffers. The comb filter is used to increase echo density and give the impression of a sizable acoustic space. Each comb filter incorporates a different-length delay line, and each delay line can be tuned to a different value to provide a different room-size characteristic. Fine-tuning the input and feedback gains of each comb filter, and the comb-filter delay-line sizes, varies the reverberation response. Since these parameters are programmable, the decay response can be modified on the fly to change the amount of the reverb effect, simulating a large hall or a small room. The total reverberation delay time depends on the size of the comb-filter/early-reflection buffers and the sampling rate. Low-pass filtering in each comb-filter stage reduces the metallic sound and shortens the reverberation time of the high frequencies, just as a real auditorium's response does. The all-pass filter is used along with the comb filters to add some color to the 'colorless/flat' sound by varying the phase, thus helping to emulate the sound characteristics of a real auditorium.
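A minimal sketch of the two building blocks just described is given below. The delay lengths and gains are illustrative (real reverbs use several mutually prime delays), and the low-pass element inside Moorer's comb stage is omitted:

```c
#include <assert.h>

/* Feedback comb filter, y[n] = x[n-D] + g*y[n-D]: repeats the input every
 * CD samples with geometrically decaying amplitude (echo density). */
#define CD 1087                  /* comb delay in samples (illustrative)     */
static float cbuf[CD];
static int   ci = 0;

float comb(float x, float g)
{
    float y = cbuf[ci];          /* value written CD samples ago             */
    cbuf[ci] = x + g * y;        /* becomes x[n-D] + g*y[n-D] in CD samples  */
    ci = (ci + 1) % CD;
    return y;
}

/* Schroeder all-pass, y[n] = -g*x[n] + x[n-D] + g*y[n-D]: flat magnitude
 * response; it smears phase to add density and "color". */
#define AD 223                   /* all-pass delay in samples (illustrative) */
static float abuf[AD];
static int   ai = 0;

float allpass(float x, float g)
{
    float y = -g * x + abuf[ai];
    abuf[ai] = x + g * y;        /* store x[n] + g*y[n] for use at n+AD      */
    ai = (ai + 1) % AD;
    return y;
}
```

Feeding an impulse into `comb` produces echoes at multiples of CD samples scaled by powers of g; Schroeder's second structure runs four such combs in parallel into two cascaded `allpass` stages.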
IIR Implementation of the Reverberation Effect
[Figure: IIR delay structure — the input x[n] plus scaled feedback forms u[n], which passes through delays z^-1 and z^-D; the internal taps v0[n] and v1[n] are scaled and summed to give y[n].]

Amplitude Effects
Volume Control; Amplitude Panning (Trigonometric / Vector-Based); Tremolo (Auto Tremolo); "Ping-Pong" Panning (Stereo Tremolo); Dynamic Range Control: Compression, Expansion, Limiting, Noise Gating.

Amplitude-Based Audio Effects
Amplitude-based audio effects involve the manipulation of the amplitude level of the audio signal, from simply attenuating or increasing the volume to more sophisticated effects such as dynamic-range compression/expansion. Effects that fall under this category include: Volume Control; Amplitude Panning (Trigonometric / Vector-Based); Tremolo (Auto Tremolo) — tremolo is the rapid repetition of one note in music, or a rapid alternation between two or more notes; "Ping-Pong" Panning (Stereo Tremolo); and Dynamic Range Control: Compression, Expansion, Limiting, Noise Gating.

Signal Level Measurement
There are many ways to measure the amplitude of a signal. The technique described below uses a simple signal-averaging algorithm to determine the signal level: it rectifies the incoming signal and averages it with the N-1 previous rectified samples. Notice, however, that it requires only 6 instructions to average N values, because we are not recalculating the summation of N values and dividing the sum by N, but rather updating a running average.
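That constant-time running-average update, applied to the rectified input for level metering, can be sketched as follows; N = 64 and the zero-initialized history (in place of the slides' separate start-up initialization) are simplifying assumptions:

```c
#include <assert.h>
#include <math.h>

#define N 64
static float hist[N];     /* last N rectified samples, zero-initialized */
static int   h = 0;       /* circular index of the oldest sample        */
static float avg = 0.0f;  /* current running average                    */

/* O(1) moving-average update: x_avg_new = x_avg_old + x[n]/N - x[n-N]/N,
 * applied to the rectified (absolute-value) input. */
float level(float in)
{
    float r = fabsf(in);            /* rectify                            */
    avg += r / N - hist[h] / N;     /* add newest/N, drop oldest/N        */
    hist[h] = r;
    h = (h + 1) % N;
    return avg;
}
```

Starting from a zeroed history, feeding a constant-amplitude input for N samples drives the meter to exactly that amplitude.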
This is how it works:

Efficient Moving Average
x_avg_old = ( x[n-N] + x[n-(N-1)] + x[n-(N-2)] + ... + x[n-2] + x[n-1] ) / N
x_avg_new = ( x[n-(N-1)] + x[n-(N-2)] + x[n-(N-3)] + ... + x[n-1] + x[n] ) / N
Subtracting the two sums, all common terms cancel:
x_avg_new - x_avg_old = x[n]/N - x[n-N]/N
x_avg_new = x_avg_old + x[n]/N - x[n-N]/N

Moving Average
Note that the algorithm needs a slightly modified initialization routine for samples earlier than the number of averaging samples N (e.g., 64):
count = 1; x_avg_old = x[0];
if (count <= N) {
    x_avg_new = (count/(count+1))*x_avg_old + x[count]/(count+1);
    x_avg_old = x_avg_new;
} else {
    x_avg_new = x_avg_old + x[n]/N - x[n-N]/N;
}

Dynamic Processing
Dynamic-processing algorithms are used to change the dynamic range of a signal, that is, to alter the distance in volume between the softest sound and the loudest sound in a signal. There are two types of dynamic-processing algorithms: compressors/limiters and expanders.

Compressors and Limiters
The function of a compressor or limiter is to keep the level of a signal within a specific dynamic range, using a technique called gain reduction: gain reduction limits the additional gain above a threshold setting by a certain ratio, ultimately keeping the signal from going past a specified level. Compressors and limiters have many applications:
Limiting the dynamic range of the signal so it can be transmitted through a medium with a limited dynamic range.
Compressing the signal to prevent distortion that results from overdriven mixer circuitry.

Compressors
There are two primary parameters for a compressor:
Threshold: the signal level at which the gain reduction begins.
Ratio: the amount of gain reduction that takes place past the threshold. Example: a 2:1 ratio reduces the signal's excess by a factor of two once it passes the threshold level.

Compression Example: Ratio 2:1, Threshold 0 [dB]
[Figure]

Compressor Parameters
Two other parameters commonly found in compressors are:
Attack time: the amount of time it takes the compressor to begin compressing a signal once it has crossed the threshold; this helps preserve the natural dynamics of a signal.
Release time: the amount of time it takes the compressor to stop attenuating the signal once its level has fallen below the threshold.

Limiter
A compressor with a ratio greater than about 10:1 is considered a limiter. The effect of a limiter is more like a clipping effect than the dampening effect of a low-ratio compressor. Clipping adds many gross harmonics to a signal; these harmonics increase in number and amplitude as the threshold level is lowered.

Examples
[Figures]

Noise Gate/Downward Expander
A noise gate or downward expander is used to reduce the gain of a signal below a certain threshold. It is useful for reducing or eliminating noise on a line when no signal is present. The relationship between a noise gate and a downward expander is similar to that of the limiter and the compressor.
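In dB terms, the static gain curves of these dynamic processors can be sketched as below. The curves are static only — the attack/release smoothing described above is omitted — and the -120 dB gate floor is an illustrative stand-in for a full cut:

```c
#include <assert.h>

/* Compressor/limiter: above the threshold, the excess level is divided
 * by the ratio (a ratio greater than ~10:1 behaves as a limiter). */
float compress_db(float in_db, float thresh_db, float ratio)
{
    if (in_db <= thresh_db) return in_db;           /* below: unchanged   */
    return thresh_db + (in_db - thresh_db) / ratio; /* compress excess    */
}

/* Downward expander: below the threshold, the deficit is multiplied by
 * the ratio, pushing quiet signals (e.g., noise) further down. */
float expand_db(float in_db, float thresh_db, float ratio)
{
    if (in_db >= thresh_db) return in_db;
    return thresh_db + (in_db - thresh_db) * ratio; /* expand deficit     */
}

/* Noise gate: the limiting case of the downward expander -- signals
 * below the threshold are cut (approximated here by a -120 dB floor). */
float gate_db(float in_db, float thresh_db)
{
    return (in_db < thresh_db) ? -120.0f : in_db;
}
```

For example, with a -20 dB threshold and a 2:1 ratio, a -6 dB input (14 dB over) comes out at -13 dB (7 dB over), matching the 2:1 compression example above.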
A noise gate cuts signals that fall below a certain threshold, whereas a downward expander attenuates signals below the threshold by a ratio.

Noise Gate Characteristics
[Figure]
Noise Gate Example
[Figure]

Expanders
An expander is a device used to increase the dynamic range of a signal; it complements the compressor. Typically it is used to restore a signal's dynamic range after it has been altered by a compressor. For example, a signal with a dynamic range of 70 dB might pass through an expander and exit with a new dynamic range of 100 dB.

Expander Properties
Expanders are characterized by the same parameters as compressors: Threshold, Ratio, Attack Time, Release Time.

Example
[Figure]

Sound Synthesis Techniques
Additive Synthesis; FM Synthesis; Wavetable Synthesis

Audio Synthesis Concepts
Three main elements in computer music:
1. The development of software to emulate the acoustics of a real or imaginary instrument.
2. The editing capability to deal with a musical score.
3. Software to integrate the above elements into a performance.

Direct Synthesis
Block-diagram compilers and specialized computer languages are used to program (i.e., compose) by defining: instruments/sound, and the score (durations, pitch, loudness).

Sound Synthesis Techniques
Sound synthesis is a technique used to create specific waveforms. It is widely used in the audio market in products that digitally recreate musical instruments and other sound effects, such as sound cards and synthesizers. The simplest forms of sound synthesis, such as FM and additive synthesis, use basic harmonic recreation of a sound through the addition and multiplication of sinusoids of varying frequency, amplitude, and phase.
Sound Synthesis Techniques
Sample playback and wavetable synthesis use digital recordings of a waveform, played back at varying frequencies, to achieve lifelike reproduction of the original sound. Subtractive synthesis and physical modeling attempt to simulate the physical model of an acoustic system.

Additive Synthesis
Fourier theory implies that any periodic sound can be constructed from sinusoids of various frequencies, amplitudes, and phases. Additive synthesis is the process of summing such sinusoids to produce a wide variety of composite signals. All three fundamental properties of the individual sinusoids — frequency, amplitude, and phase — are combined to accurately reproduce a variety of instruments.

Direct Synthesis
Direct synthesis requires specific software modules that generate sounds: oscillators, noise generators, frequency shifters, envelope generators, and other block functions.

Direct Synthesis / Additive Synthesis
[Figure: a bank of N sine-wave oscillators at frequencies f1 ... fN is summed and shaped by the envelope of an instrument note to produce the synthesized tone y[n].]

Additive Synthesis
Compared to other synthesis techniques, additive synthesis can require a significant amount of processing power, depending on the number of sinusoidal oscillators used; there is a direct relationship between the number of harmonics generated and the number of processor cycles required. The basic formula is:
y[n] = A1*sin(2*pi*f1*n + phi1) + A2*sin(2*pi*f2*n + phi2) + ... + AN*sin(2*pi*fN*n + phiN)
     = sum_{i=1..N} Ai*sin(2*pi*fi*n + phi_i)

FM Synthesis
FM synthesis is similar to additive synthesis in that it uses simple sinusoids to create a wide range of sounds.
This is achieved by using one finite formula to create an infinite number of harmonics. The equation shown below uses a fundamental sinusoid that is modulated by another sinusoid:
y[n] = A(n)*sin( 2*pi*f_c*n + beta*sin(2*pi*f_m*n) )
When this equation is expanded (as a Fourier series), it contains an infinite number of harmonics:

FM Synthesis
y[n] = J0(beta)*sin(2*pi*f_c*n)
     + J1(beta)*[ sin(2*pi*(f_c + f_m)*n) - sin(2*pi*(f_c - f_m)*n) ]
     + J2(beta)*[ sin(2*pi*(f_c + 2*f_m)*n) + sin(2*pi*(f_c - 2*f_m)*n) ]
     + J3(beta)*[ sin(2*pi*(f_c + 3*f_m)*n) - sin(2*pi*(f_c - 3*f_m)*n) ]
     + ...
     = J0(beta)*sin(2*pi*f_c*n)
     + sum_{k=1..inf} Jk(beta)*[ sin(2*pi*(f_c + k*f_m)*n) + (-1)^k * sin(2*pi*(f_c - k*f_m)*n) ]
Note: Jk is the kth-order Bessel function, and beta is the modulation index.

Wavetable Synthesis
Wavetable synthesis is a popular and efficient technique for synthesizing sounds, especially in sound cards and synthesizers. Using a lookup table of prerecorded waveforms, the wavetable-synthesis engine repeatedly plays the desired waveform, or combinations of multiple waveforms, to simulate the timbre of an instrument. The looped playback of the sample can also be modulated by an amplitude function that controls its attack, decay, sustain, and release, to create an even more realistic reconstruction of the original instrument.

Example
[Figure]

Wavetable Synthesis
This method of synthesis is simple to implement and computationally efficient. In a DSP, the desired waveforms can be loaded into circular buffers to allow zero-overhead looping. The only real computational operations are adding multiple waveforms, calculating the amplitude envelope, and modulating the looping sample with it. The downside of wavetable synthesis is that it is difficult to approximate rapidly changing spectra.
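A minimal sketch of the looped, envelope-modulated playback just described — the table size, its contents, and the integer pitch step are illustrative assumptions (a real engine would use a fractional phase accumulator with interpolation):

```c
#include <assert.h>

#define TLEN 256                  /* wavetable length (illustrative)      */
static float table[TLEN];         /* one period of the stored waveform    */
static int   tpos = 0;            /* circular read position               */

/* Play the looped wavetable, scaled by an amplitude envelope value
 * (attack/decay/sustain/release).  `step` controls pitch: larger steps
 * skip table entries, playing the stored period back at a higher pitch. */
float wavetable_sample(float envelope, int step)
{
    float s = table[tpos];
    tpos = (tpos + step) % TLEN;  /* circular buffer: zero-overhead loop  */
    return envelope * s;
}
```

The per-sample cost is one table read, one index update, and one multiply by the envelope — the efficiency the slide attributes to wavetable synthesis.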
Sample Playback
Sample playback is another computationally efficient synthesis technique that yields extremely high sound quality. An entire sample is stored in memory for each instrument and played back at a selected pitch. Often these samples have loop points within them, which can be used to alter the duration of the sustain, giving an even more lifelike reproduction of the sound.

Sample Playback
Although this method is capable of producing extremely accurate reproductions of almost any instrument, it requires large amounts of memory to hold the sampled instrument data. For example, to duplicate the sound of a grand piano, the sample stored in memory would have to be about 5 seconds long. If this sample were stereo and sampled at 44.1 [kHz], this single instrument would require 441,000 words of memory! To recreate many octaves of a piano, the system would require multiple piano samples, because slowing a sample of a C5 on a piano down to the pitch of a C2 would sound nothing like an actual C2. This technique is widely used in high-end keyboards and sound cards. Just like wavetable synthesis, sample playback requires very little computational power; it can easily be implemented in a DSP using circular buffers with simple loop-point detection.

Subtractive Synthesis
Subtractive synthesis begins with a signal containing all of the required harmonics and selectively attenuates (or boosts) certain frequencies to simulate the desired sound. The amplitude of the signal can be varied using an envelope function, as in the other simple synthesis techniques. This technique is effective at recreating instruments that use an impulse-like stimulus, such as a plucked string or a drum.

THE END

This note was uploaded on 02/10/2012 for the course ECE 3551 taught by Professor Staff during the Spring '11 term at FIT.
