High Performance DSP Architectures
Definition
Digital Signal Processing is carried out by mathematical operations. Digital
Signal Processors are microprocessors specifically designed to handle Digital
Signal Processing tasks. These devices have seen tremendous growth in the last
decade, finding use in everything from cellular telephones to advanced scientific
instruments. In fact, hardware engineers use "DSP" to mean Digital Signal
Processor, just as algorithm developers use "DSP" to mean Digital Signal
Processing. DSP has become a key component in many consumer, communications, medical,
and industrial products. These products use a variety of hardware approaches to
implement DSP, ranging from the use of off-the-shelf microprocessors to field-programmable
gate arrays (FPGAs) to custom integrated circuits (ICs). Programmable
"DSP processors," a class of microprocessors optimized for DSP, are
a popular solution for several reasons. In comparison to fixed-function solutions,
they have the advantage of potentially being reprogrammed in the field, allowing
product upgrades or fixes. They are often more cost-effective than custom hardware,
particularly for low-volume applications, where the development cost of ICs may
be prohibitive. Compared with general-purpose microprocessors, DSP processors often
have an advantage in terms of speed, cost, and energy efficiency.
DSP Algorithms Mould DSP Architectures
From the outset, DSP algorithms have
moulded DSP processor architectures. For nearly every feature found in a DSP processor,
there are associated DSP algorithms whose computation is in some way eased by
inclusion of this feature. Therefore, perhaps the best way to understand the evolution
of DSP architectures is to examine typical DSP algorithms and identify how their
computational requirements have influenced the architectures of DSP processors.
Fast Multipliers
The FIR filter is mathematically expressed as a sum of products between a vector
of input data and a vector of filter coefficients. For each "tap" of
the filter, a data sample is multiplied by a filter coefficient, with the result
added to a running sum for all of the taps. Hence, the main component of the
FIR filter algorithm is a dot product: multiply and add, multiply and add. These
operations are not unique to the FIR filter algorithm; in fact, multiplication
is one of the most common operations performed in signal processing. Convolution,
IIR filtering, and Fourier transforms also all involve heavy use of multiply-accumulate
operations. Originally, microprocessors implemented multiplications by a series
of shift and add operations, each of which consumed one or more clock cycles.
As might be expected, faster multiplication hardware yields faster performance
in many DSP algorithms, and for this reason all modern DSP processors include
at least one dedicated single-cycle multiplier or combined multiply-accumulate
(MAC) unit.
Multiple Execution Units
DSP
applications typically have very high computational requirements in comparison
to other types of computing tasks, since they often must execute DSP algorithms
in real time on lengthy segments of signals sampled at 10-100 kHz or higher. Hence,
DSP processors often include several independent execution units that are capable
of operating in parallel; for example, in addition to the MAC unit, they typically
contain an arithmetic-logic unit (ALU) and a shifter.
Efficient Memory Accesses
Executing a MAC in every clock cycle requires more than
just a single-cycle MAC unit. It also requires the ability to fetch the MAC instruction,
a data sample, and a filter coefficient from memory in a single cycle. To address
the need for increased memory bandwidth, early DSP processors adopted
memory architectures that could support multiple memory accesses per cycle. Often,
instructions were stored in one memory bank, while data was stored in another.
With this arrangement, the processor could fetch an instruction and a data operand
in parallel in every cycle. Since many DSP algorithms consume two data operands
per instruction, a further optimization commonly used is to include a small bank
of RAM near the processor core that is used as an instruction cache. When a small
group of instructions is executed repeatedly, the cache is loaded with those instructions,
freeing the instruction bus to be used for data fetches instead of instruction
fetches, thus enabling the processor to execute a MAC in a single cycle. High memory
bandwidth requirements are often further supported via dedicated hardware for
calculating memory addresses.
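The memory-bandwidth argument above can be checked with a little arithmetic. The sketch below is a deliberately simplified model, not a description of any particular processor: it assumes that each memory path serves one access per cycle and that a MAC completes only when all of its accesses have been served. The function name cycles_per_mac and the two constants are made up for this illustration.

```python
def cycles_per_mac(accesses_needed: int, accesses_per_cycle: int) -> int:
    """Cycles needed to complete all memory accesses for one MAC.

    Simplified model (an assumption): every memory path serves one access
    per cycle, and the MAC finishes only once all its accesses are done.
    """
    # Ceiling division: any leftover access costs one more full cycle.
    return -(-accesses_needed // accesses_per_cycle)


# One instruction fetch plus two data operands (sample and coefficient).
WITH_INSTRUCTION_FETCH = 3
# Inside a cached loop, the instruction no longer comes from memory.
DATA_ONLY = 2

if __name__ == "__main__":
    print("one memory, one access per cycle:",
          cycles_per_mac(WITH_INSTRUCTION_FETCH, 1))
    print("separate instruction and data banks:",
          cycles_per_mac(WITH_INSTRUCTION_FETCH, 2))
    print("two banks plus instruction cache:",
          cycles_per_mac(DATA_ONLY, 2))
```

Under these assumptions the three cases come out at 3, 2, and 1 cycles per MAC, which matches the progression described in the text. The count covers only the fetches themselves; working out where each operand lives is a separate cost, which is the job of the dedicated address-calculation hardware just mentioned.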
These address generation units operate in parallel
with the DSP processor's main execution units, enabling it to access data at new
locations in memory without pausing to calculate the new address. Memory accesses
in DSP algorithms tend to exhibit very predictable patterns; for example, in an
FIR filter, the filter coefficients are accessed sequentially from start to finish
for each input sample, and accesses then start over from the beginning
of the coefficient vector when processing the next input sample.
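The access pattern just described, together with the multiply-accumulate loop from the Fast Multipliers section, can be sketched in a few lines. This is a minimal illustration in plain Python rather than DSP assembly. The function names fir_step and fir_filter are hypothetical, and storing past samples in a wrap-around delay line is an assumption made for the sketch (one common arrangement), not something the text prescribes.

```python
def fir_step(coeffs, delay_line, newest, sample):
    """Process one input sample; return (output, updated newest index)."""
    n_taps = len(coeffs)
    # Overwrite the oldest stored sample; the index wraps around, much as
    # an address generator would step through a fixed block of memory.
    newest = (newest + 1) % n_taps
    delay_line[newest] = sample

    acc = 0.0
    idx = newest
    for k in range(n_taps):
        # One multiply-accumulate per tap; coefficients are read
        # sequentially from start to finish.
        acc += coeffs[k] * delay_line[idx]
        # Step back to the next-older sample, wrapping at the start.
        idx = (idx - 1) % n_taps
    return acc, newest


def fir_filter(coeffs, samples):
    """Filter a whole sequence, restarting the coefficient scan per sample."""
    delay_line = [0.0] * len(coeffs)
    newest = -1  # so the first sample lands at index 0
    out = []
    for x in samples:
        y, newest = fir_step(coeffs, delay_line, newest, x)
        out.append(y)
    return out


if __name__ == "__main__":
    # Feeding a unit impulse returns the coefficients, then zeros.
    print(fir_filter([0.5, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0]))
```

Each pass through the inner loop is one MAC with two data operands (a coefficient and a stored sample), and both index updates are simple increments or decrements with wrap-around. That regularity is what lets dedicated address hardware compute the next location while the execution units do the arithmetic.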