Experiment 3
MULTIMEDIA SIGNAL COMPRESSION: SPEECH AND AUDIO

I Introduction

A key technology that enables distributing speech and audio signals without mass storage media or large transmission bandwidth is compression, also known as coding. It reduces the amount of data needed to transmit and store digitally sampled audio, either during the analog-to-digital conversion step or after the raw file has been stored digitally. Audio compression and decompression can be accomplished by various types of algorithms, which can be incorporated into software applications or programmed into special-purpose integrated-circuit chips.

Several international standards have been established for audio and video coding, including MPEG-1 and MPEG-2. No international or national standard has been established for compressing and decompressing waveform speech and audio files for desktop multimedia applications, yet there are many schemes a user can choose from for compressing waveform files. The following sections describe the most commonly used algorithms and the various types of compression methods for audio and speech.

II Theories and Schemes

We have already discussed the sampling theorem in Experiment 2. There it was shown that samples of an analog signal are a unique representation of the signal if the signal is bandlimited and the sampling rate is at least twice the highest frequency present in the signal. Since we are concerned with digital representations of speech and audio signals, we need to consider the spectral properties of speech and audio. It has been observed that for voiced speech sounds, the frequencies above 4 kHz are more than 40 dB below the peak of the spectrum. For general audio signals, on the other hand, the spectrum does not fall off appreciably even above 8 kHz. Thus, to accurately represent all audio sounds would require a sampling rate greater than 20 kHz.
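To make the cost of these sampling rates concrete, the sketch below computes raw (uncompressed) PCM bit rates for two common cases: telephone-band speech sampled at 8 kHz and CD-quality stereo audio sampled at 44.1 kHz. The 16-bit sample width is an assumed illustrative parameter, not something specified in this experiment.

```python
def pcm_rate_kbps(fs_hz, bits_per_sample, channels=1):
    """Raw PCM bit rate in kilobits per second: fs * bits * channels."""
    return fs_hz * bits_per_sample * channels / 1000

# Telephone-band speech: 8 kHz sampling, 16 bits/sample, mono
speech = pcm_rate_kbps(8_000, 16)        # 128.0 kbps
# CD-quality audio: 44.1 kHz sampling, 16 bits/sample, stereo
cd_audio = pcm_rate_kbps(44_100, 16, 2)  # 1411.2 kbps

print(speech, cd_audio)
```

Rates of this magnitude are exactly what motivates the coding schemes discussed in the rest of this experiment.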
In addition, for a computer to represent a sampled signal, the possible values taken by a sample, which vary over a continuous range, must be discretized to a finite set of values. This process is called quantization.

II.1 Quantization of Sampled Signals

II.1.1 Uniform Quantization

The quantization ranges and levels may be chosen in a variety of ways depending on the intended application of the digital representation. With uniform quantization, the dynamic range (minimum to maximum) of the signal, R, is divided into L equal-sized intervals, each with length Δ = R/L. We call Δ the quantization step size. The input (unquantized value) versus output (quantized value) relationship of a uniform quantizer is shown in Fig. 3.1. There, x_i represents the right boundary of interval i, and x̂_i the quantization level of this interval. They satisfy

    x_i - x_{i-1} = Δ        (3.1)

and

    x̂_i - x̂_{i-1} = Δ.      (3.2)

Any value in the i-th interval is mapped to the middle value of this interval, i.e. x̂_i = (x_{i-1} + x_i)/2.
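The uniform quantizer described above can be sketched as follows. This is a minimal illustration, not the experiment's reference implementation; the dynamic range (-1, 1) and level count L are assumed example parameters. Each sample is assigned to one of L intervals of width Δ and replaced by that interval's midpoint, so the quantization error never exceeds Δ/2.

```python
import numpy as np

def uniform_quantize(x, r_min=-1.0, r_max=1.0, L=256):
    """Uniform midpoint quantizer.

    The dynamic range [r_min, r_max] is split into L intervals of
    width delta = (r_max - r_min) / L; each sample is mapped to the
    midpoint of the interval that contains it.
    """
    delta = (r_max - r_min) / L          # quantization step size
    x = np.clip(x, r_min, r_max)         # limit samples to the dynamic range
    # Index of the interval containing each sample (0 .. L-1)
    i = np.minimum(((x - r_min) / delta).astype(int), L - 1)
    # Midpoint of interval i
    return r_min + (i + 0.5) * delta

x = np.array([0.0, 0.3, -0.99])
xq = uniform_quantize(x, L=8)   # delta = 0.25; xq == [0.125, 0.375, -0.875]
print(xq)
```

With L = 8 levels over the range (-1, 1), the step size is Δ = 0.25, and every sample lands within Δ/2 = 0.125 of its quantized value, as equations (3.1)-(3.2) imply.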