The second generation of the Intel® Core™ processor family offers higher-performance DSP capability than any previous Intel® Architecture (IA) processor or, for that matter, any general-purpose microprocessor. Indeed, the Intel® Advanced Vector Extensions (AVX) instruction set and single-instruction multiple-data (SIMD) execution unit on processors such as the Intel® Core™ i7 enable design teams to develop analytics systems without relying on a dedicated DSP IC or FPGA. Example applications include surveillance systems that rely on image processing, military radar systems, and automotive vehicle-classification and driver-assist systems.


I covered some of the details on the AVX instructions and second-generation architecture in a recent post on sensing and analytics. Today, let’s look at the architecture as it relates specifically to DSP applications and at some real benchmark data.


Among the keys to DSP performance is the doubling of the data path and AVX instruction width to 256 bits, whereas prior IA processors relied on 128-bit Intel® Streaming SIMD Extensions (SSE). But processing capability alone isn’t the entire story. Analytics applications also require the processor to move rich data streams onto the chip, feed the SIMD execution unit, and store objects such as elements of an image in memory.


An application such as facial recognition must continuously process captured image frames, breaking each image into relatively small groups of pixels. The processor must execute DSP algorithms on each pixel set. For example, algorithms might correct for camera lens distortion and sharpen the image, perform color space conversion, and filter noise. Such preprocessing must happen before the processor can perform the actual recognition or pattern-matching algorithm.
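To make the block-by-block flow concrete, here is a minimal Python sketch of one preprocessing step, color space conversion, applied tile by tile. The 8x8 tile size, the frame layout (rows of (R, G, B) tuples), and the function names are my own assumptions for illustration; the luma weights are the standard BT.601 ones. A real pipeline would run this kind of loop in vectorized native code rather than in Python.

```python
# Illustrative sketch only: tile size, frame layout, and names are assumptions.

def rgb_to_luma(pixel):
    """Convert one (R, G, B) pixel to 8-bit luma using the BT.601 weights."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def iter_blocks(frame, block):
    """Yield (top, left, tile) for each block x block tile of a 2-D frame.

    Tiles on the right and bottom edges are smaller when the frame size
    is not a multiple of the block size.
    """
    height, width = len(frame), len(frame[0])
    for top in range(0, height, block):
        for left in range(0, width, block):
            tile = [row[left:left + block] for row in frame[top:top + block]]
            yield top, left, tile


def preprocess(frame, block=8):
    """Apply the color space conversion tile by tile and return a luma frame."""
    height, width = len(frame), len(frame[0])
    luma = [[0] * width for _ in range(height)]
    for top, left, tile in iter_blocks(frame, block):
        for dy, row in enumerate(tile):
            for dx, pixel in enumerate(row):
                luma[top + dy][left + dx] = rgb_to_luma(pixel)
    return luma
```

Each tile is independent of the others, which is what makes this kind of work a good fit for SIMD units and for spreading across cores.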


The new Core processors have several features in addition to SIMD to enable such applications. The on-chip ring interconnect is optimized to move rich data streams, the tiered memory architecture provides the required bandwidth, and the latest PCI Express® Gen2 implementation supports 5 GT/sec (gigatransfers per second) on each lane.
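As a rough sanity check on the PCI Express number: Gen2 uses 8b/10b encoding, so 5 GT/sec works out to about 500 Mbytes/sec of raw payload per lane in each direction, before protocol overhead. The sketch below does that arithmetic. The function name and the lane counts you pass in are illustrative assumptions, since the link width depends on the board design.

```python
# Back-of-envelope sketch: names and lane counts are assumptions for illustration.

def pcie_gen2_mbytes_per_sec(lanes, transfer_rate_gt_s=5.0):
    """Raw one-direction payload bandwidth of a PCIe Gen2 link in Mbytes/sec.

    Gen2 uses 8b/10b line coding, so only 8 of every 10 transferred bits
    carry payload. Packet and protocol overhead are ignored here.
    """
    payload_bits_per_sec = transfer_rate_gt_s * 1e9 * 8 / 10
    bytes_per_sec = lanes * payload_bits_per_sec / 8
    return bytes_per_sec / 1e6
```

Calling pcie_gen2_mbytes_per_sec(1) gives the per-lane figure, and passing a wider link such as 16 lanes scales it linearly.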


Companies devoted to applications such as surveillance have certainly recognized the DSP potential. Indeed, GE Intelligent Platforms* has published a new whitepaper entitled “DSP applications to reap benefits from inclusion of AVX in processors.”


The whitepaper covers both AVX and some of the data-movement capabilities that I mentioned above. For example, the paper highlights the three-level cache and the integrated DDR3 memory controllers. According to GE, the memory architecture in aggregate supports 21.35 Gbytes/sec in peak bandwidth.
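For a sense of where a figure like that comes from, peak DRAM bandwidth is simply channels times bus width times transfer rate. The sketch below assumes two 64-bit channels of DDR3-1333; that configuration is my assumption, not something taken from the whitepaper. It lands at roughly 21.3 Gbytes/sec, in line with GE's number.

```python
# Back-of-envelope sketch: the channel count and speed grade are assumptions.

def peak_dram_gbytes_per_sec(channels=2, bus_bits=64, mega_transfers_per_sec=1333):
    """Theoretical peak DRAM bandwidth in Gbytes/sec (10**9 bytes per second)."""
    bytes_per_transfer = bus_bits // 8
    bytes_per_sec = channels * bytes_per_transfer * mega_transfers_per_sec * 1e6
    return bytes_per_sec / 1e9
```

This is a theoretical ceiling; sustained bandwidth in a real application will be lower.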


Still, it’s the AVX capability that GE stresses as the key to DSP-centric applications such as surveillance and radar. The whitepaper stresses both the wider instructions and the fact that there is an AVX SIMD unit in each of the two or four cores on i7 processors. Each core can handle eight 32-bit or four 64-bit floating-point operations simultaneously, and a four-core processor offers 4x that capability. GE noted the importance of being able to process 64 operations per clock cycle on a four-core processor.
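The lane counts fall straight out of the register width, and a few lines of Python make the arithmetic explicit. The function names are mine, and the units_per_core parameter is an assumption: four cores times eight lanes gives 32, so the 64-operation figure presumably counts a multiply and an add issued in the same cycle on each core.

```python
# Arithmetic sketch: names and the units_per_core parameter are assumptions.

def simd_lanes(register_bits, element_bits):
    """Number of elements one SIMD register holds."""
    return register_bits // element_bits


def peak_ops_per_cycle(cores, register_bits=256, element_bits=32, units_per_core=1):
    """Peak SIMD element operations per clock across all cores.

    units_per_core models how many vector instructions a core can issue
    per cycle (for example, one multiply plus one add gives 2).
    """
    return cores * units_per_core * simd_lanes(register_bits, element_bits)
```

With these definitions, simd_lanes(256, 32) is 8 and simd_lanes(256, 64) is 4 for AVX, versus 4 and 2 for 128-bit SSE. A four-core part gives peak_ops_per_cycle(4) of 32, or 64 with units_per_core=2.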


GE tested the second-generation Core processors using a Synthetic Aperture Radar code benchmark. Relative to first-generation Core processors at similar clock speeds, the new processors offer more than double the DSP performance. The whitepaper also notes that Intel® Hyper-Threading Technology (Intel HT) can boost performance 25 to 30%.


[Image: GE DSP280 6U OpenVPX board]


GE offers a broad set of second-generation Core i7 single board computers. The portfolio includes the DSP280 6U OpenVPX (pictured), the XCR14 6U CompactPCI, the XVR14 6U VME, the SBC324 3U OpenVPX, and the SBC624 6U OpenVPX boards. GE offers the products in five levels of ruggedization – from “benign to fully rugged.”


There are several other sources of good information on implementing DSP algorithms on IA processors in general and on the AVX capabilities specifically.

Although it was written about prior-generation IA processors and SSE technology, Curtiss-Wright Controls Embedded Computing** wrote an excellent article entitled “Military signal processing with Intel Architecture” that was published in the Embedded Innovator magazine. The article presents a benchmark based on an FFT algorithm. It also describes the data flow through the processor and all of the information presented can be easily applied to the latest IA processors.


You will also find a section of the Intel® Embedded Design Center called “Signal processing on Intel Architecture” that, as the title indicates, is dedicated to DSP. That section links to other whitepapers and information resources on AVX.


AVX offers design teams the ability to reduce system footprint, weight, power consumption, and cost by eliminating the need for other DSP-centric ICs or FPGAs. The embedded industry, and especially the military and aerospace segment, has an acronym for such savings – SWaP (size, weight, and power). GE noted in its whitepaper that SWaP reduction is a key AVX benefit.


Is SWaP a key concern in your projects? Have you utilized SSE or AVX instructions to handle DSP algorithms? What technical hurdles did you face and how did you overcome them? Please share your experiences with fellow followers of the Intel® Embedded Community via comments.


To view other community content focused on sensing and analytics, see “Sensing and Analytics – Top Picks.”



Maury Wright

Roving Reporter (Intel Contractor)

Intel® Embedded Alliance


* General Electric Intelligent Platforms is an Associate member of the Intel® Embedded Alliance

** Curtiss-Wright Controls Embedded Computing is an Affiliate member of the Alliance