Introduction
Basics of Audio Compression
Advances in digital audio technology are fueled by
two sources: hardware developments and new signal processing techniques. When
processors dissipated tens of watts of power and memory densities were on the
order of kilobits per square inch, portable playback devices such as MP3 players
were not possible. Since then, power dissipation, memory densities, and processor
speeds have improved by several orders of magnitude. Advancements in signal processing
are exemplified by Internet broadcast applications: if an Internet broadcast used
16-bit PCM encoding at 44.1 kHz to achieve the desired sound quality, it would
require a channel of about 1.4 Mbps (2 channels x 16 bits x 44,100 samples per
second) for a stereo signal. Fortunately, new bit rate reduction techniques for
audio of this quality are constantly being released.
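The bit rate figure quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name `pcm_bit_rate` is a hypothetical helper, not part of any standard.

```python
def pcm_bit_rate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Return the bit rate of uncompressed PCM audio in bits per second."""
    return channels * bits_per_sample * sample_rate_hz


# Stereo, 16-bit samples, 44.1 kHz sampling rate (the case discussed in the text).
stereo_cd_bps = pcm_bit_rate(channels=2, bits_per_sample=16, sample_rate_hz=44_100)
print(f"{stereo_cd_bps} bit/s = {stereo_cd_bps / 1e6:.2f} Mbps")
# prints "1411200 bit/s = 1.41 Mbps"
```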
Increasing hardware efficiency and an expanding array of digital audio representation formats
are giving rise to a wide variety of new digital audio applications. These applications
include portable music playback devices, digital surround sound for cinema, high-quality
digital radio and television broadcast, Digital Versatile Disc (DVD), and many
others. This paper introduces digital audio signal compression, a technique essential
to the implementation of many digital audio applications. Digital audio signal
compression is the removal of redundant or otherwise irrelevant information from
a digital audio signal, a process that is useful for conserving both transmission
bandwidth and storage space. We begin by defining some useful terminology. We
then present a typical "encoder" (as compression algorithms are often
called) and explain how it functions. Finally, we consider some standards that employ
digital audio signal compression and discuss the future of the field.
Psychoacoustics is the study of the subjective human perception of sound.
Psychoacoustic modeling has long been an integral part of audio compression.
It exploits properties of the human auditory system to remove information in
audio signals that the human ear cannot perceive. For example, more powerful
signals at certain frequencies "mask" less powerful signals at nearby frequencies
by desensitizing the ear's basilar membrane (which is responsible for resolving
the frequency components of a signal). The entire MP3 phenomenon is made possible
by the confluence of several distinct but interrelated elements: a few simple
insights into the nature of human psychoacoustics, a great deal of number crunching,
and conformance to a tightly specified format for encoding and decoding audio
into compact bitstreams.
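The masking idea can be illustrated with a deliberately crude toy model. This is not the psychoacoustic model of MP3 or of any other standard: the fixed `bandwidth_hz` and `margin_db` parameters, and the function name `masked_components`, are assumptions made up for this sketch. Real models work on a perceptual frequency scale with level-dependent spreading functions.

```python
import math


def masked_components(components, bandwidth_hz=200.0, margin_db=20.0):
    """Return the components hidden by a stronger neighbour in this toy model.

    components: list of (frequency_hz, power) pairs, with power > 0.
    A component counts as masked if another component lies within
    bandwidth_hz of it and is at least margin_db more powerful.
    Both thresholds are illustrative assumptions, not measured values.
    """
    masked = []
    for freq, power in components:
        for other_freq, other_power in components:
            nearby = 0 < abs(other_freq - freq) <= bandwidth_hz
            louder_db = 10.0 * math.log10(other_power / power)
            if nearby and louder_db >= margin_db:
                masked.append((freq, power))
                break
    return masked


# A strong 1 kHz tone, a weak tone right next to it, and a weak tone far away.
tones = [(1000.0, 1.0), (1100.0, 0.001), (5000.0, 0.001)]
print(masked_components(tones))  # prints [(1100.0, 0.001)]
```

Only the weak tone close to the strong one is flagged; the equally weak tone at 5 kHz has no powerful neighbour, so an encoder following this idea would still have to represent it.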
Terminology
Audio Compression vs. Speech Compression
This paper focuses on audio compression
techniques, which differ from those used in speech compression. Speech compression
uses a model of the human vocal tract to express a particular signal in a compressed
format. This technique is not usually applied in the field of audio compression
because of the vast array of sounds that can be generated: models that represent audio
generation would be too complex to implement. So instead of modeling the source
of sounds, modern audio compression models the receiver, i.e., the human ear.
Lossless vs. Lossy
When we speak of compression, we must distinguish between two different types: lossless
and lossy. Lossless compression retains all the information in a given signal,
i.e., a decoder can perfectly reconstruct a compressed signal. In contrast, lossy
compression eliminates information from the original signal. As a result, a reconstructed
signal may differ from the original. With audio signals, the differences between
the original and reconstructed signals only matter if they are detectable by the
human ear. As we will explore shortly, audio compression employs both lossy and
lossless techniques.
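The distinction can be made concrete with a small sketch. It uses Python's standard `zlib` module as a stand-in for a lossless coder and plain coarse requantization as a stand-in for a lossy one; neither is the method used by any particular audio codec, and the 440 Hz test signal is a hypothetical input chosen for illustration.

```python
import math
import struct
import zlib

# Hypothetical test signal: 1000 samples of a 440 Hz sine as 16-bit PCM values.
sample_rate_hz = 44_100
samples = [round(10_000 * math.sin(2 * math.pi * 440 * n / sample_rate_hz))
           for n in range(1000)]

# Lossless: compress the raw bytes, decompress, and compare with the original.
raw = struct.pack(f"<{len(samples)}h", *samples)
restored_raw = zlib.decompress(zlib.compress(raw))
restored = list(struct.unpack(f"<{len(samples)}h", restored_raw))
print(restored == samples)  # prints True: every sample is recovered exactly

# Lossy: keep only a coarse version of each sample (step size 256),
# then reconstruct by placing each value in the middle of its step.
step = 256
quantized = [s // step for s in samples]
approx = [q * step + step // 2 for q in quantized]
max_error = max(abs(a - s) for a, s in zip(approx, samples))
print(approx == samples)    # prints False: information was discarded
print(max_error <= step // 2)  # prints True: the error is bounded by half a step
```

Whether an error of this size is acceptable depends on whether a listener can hear it, which is exactly the question a perceptual encoder tries to answer.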
Basic Building Blocks
Figure 1 shows a generic encoder or "compressor" that
takes blocks of sampled audio signal as its input. These blocks typically consist
of between 500 and 1500 samples per channel, depending on the encoder specification.
For example, the MPEG-1 Layer III (MP3) specification takes 576 samples per channel
per input block. The output is a compressed representation of the input block
(a "frame") that can be transmitted or stored for subsequent decoding.
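As a rough sketch of the input stage described above, the following splits one channel of samples into fixed-size blocks. The block size of 576 is taken from the text; the function name `split_into_blocks` and the choice to zero-pad the final block are assumptions for illustration, since the text does not say how an encoder handles a partial last block.

```python
def split_into_blocks(samples, block_size=576):
    """Split one channel of samples into blocks of block_size samples.

    The last block is padded with zeros if the input does not divide evenly
    (an assumption of this sketch, not something the text specifies).
    """
    blocks = []
    for start in range(0, len(samples), block_size):
        block = list(samples[start:start + block_size])
        block.extend([0] * (block_size - len(block)))
        blocks.append(block)
    return blocks


# Hypothetical input: 1500 samples of a single channel.
one_channel = list(range(1500))
blocks = split_into_blocks(one_channel)
print(len(blocks), [len(b) for b in blocks])  # prints 3 [576, 576, 576]
```

Each block would then be handed to the encoder, which emits one compressed frame per block.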