Many engineers seem to think that microprocessors and FPGAs are competing technologies. In reality, an Intel® Architecture (IA) processor and an FPGA from a vendor such as Xilinx* or Altera** can be complementary processing blocks in compute-intensive applications. Specifically, data-flow applications in the communications, imaging, military, and medical fields benefit from the powerful combination of a processor and an FPGA. Moreover, IA gives design teams the opportunity to closely couple a processor and an FPGA to optimize the collaborative-computing approach.
Processors and FPGAs are optimized for quite different types of processing. Processors offer the ultimate in flexibility. A huge community of software developers can program IA processors, and a wide range of tools and platforms helps speed application development.
FPGAs are far more difficult to program, requiring a team that can write RTL code, synthesize that code, perform place-and-route, and verify the design. But an FPGA fabric is inherently capable of parallel processing. Moreover, designers can configure an FPGA to apply a sequence of algorithms to parallel data streams that flow through the fabric. The generalized high-level block diagram of an FPGA from Xilinx depicts this potential for parallel and sequential processing.
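To make the data-flow idea concrete, here is a minimal software sketch in Python. It is only a model of the concept, not FPGA code: the stage functions (`scale`, `offset`, `clamp`) and the `run_pipeline` helper are hypothetical names invented for illustration. In a real FPGA each stream would have its own copy of the stage logic in hardware and all streams would advance at the same time, whereas this model simply loops over them.

```python
from typing import Callable, List, Sequence

# A stage is one step in the sequential chain of algorithms.
Stage = Callable[[int], int]


def scale(sample: int) -> int:
    """Hypothetical stage 1: multiply the sample by a fixed gain."""
    return sample * 2


def offset(sample: int) -> int:
    """Hypothetical stage 2: add a fixed offset."""
    return sample + 1


def clamp(sample: int) -> int:
    """Hypothetical stage 3: limit the result to an 8-bit range."""
    return max(0, min(255, sample))


PIPELINE: List[Stage] = [scale, offset, clamp]


def run_pipeline(
    streams: Sequence[Sequence[int]],
    stages: Sequence[Stage] = PIPELINE,
) -> List[List[int]]:
    """Apply every stage, in order, to every sample of every stream.

    The outer loop stands in for the parallel streams; in hardware
    these would be processed concurrently rather than one after another.
    """
    results: List[List[int]] = []
    for stream in streams:
        processed: List[int] = []
        for sample in stream:
            for stage in stages:
                sample = stage(sample)
            processed.append(sample)
        results.append(processed)
    return results


if __name__ == "__main__":
    # Two independent input streams flowing through the same stage chain.
    print(run_pipeline([[0, 10, 200], [1, 2, 3]]))
```

The point of the sketch is the shape of the computation: a fixed chain of simple operations applied identically to many independent streams, which is the kind of workload that maps well onto parallel hardware.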
The combination of a processor and an FPGA can be quite powerful. Xilinx, for instance, points to several specific applications. In a military radar application, an FPGA can be dedicated to the compute-intensive beam-forming task while an IA processor handles the remainder of the system functions. In communications gear, FPGAs can perform tasks such as encryption. And the parallel capabilities of an FPGA are a good match for specific imaging tasks such as recognizing elements in a video stream.
There are a number of ways that design teams can combine traditional processors and FPGAs. For years the approach was board based. In a rugged system, such as one based on CompactPCI, the processor and FPGA subsystems sat on separate boards. In a more PC-like environment, the processor was on the motherboard and the FPGA was hosted on a PCI or PCIe board.
Despite the advancements in system-bus performance realized in technologies such as PCIe, a close coupling of processor and FPGA enables greater performance and a wider range of applications. Intel has enabled that close link via the older Front Side Bus (FSB) that was used to link the processor and core logic, and more recently via Intel® QuickPath Interconnect (QPI). QPI was introduced with the Nehalem microarchitecture and is now shipping in a variety of processors, including the Intel® Xeon® 5500/5600 Processor series and some Intel® Core™ i3, i5, and i7 processors.
A system design that connects the FPGA directly to the FSB or QPI allows the FPGA to share memory access with the processor. Shared access enables memory coherency and minimizes the explicit data transfers that were previously required to send data to, and receive data from, the FPGA subsystem.
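The difference between the two models can be sketched in a few lines of Python. This is a software analogy only, under stated assumptions: the function names are hypothetical, the "accelerator" is a trivial byte inversion, and nothing here models cache coherency or real bus behavior. It simply contrasts the bookkeeping of a copy-in, copy-out design with one where both sides work on the same buffer.

```python
from typing import Tuple


def accelerate_with_copies(host_data: bytearray) -> Tuple[bytearray, int]:
    """Model of a bus-attached accelerator with its own memory.

    Data is copied to the device, processed there, and copied back.
    Returns the result and the number of bytes moved across the bus.
    """
    device_buffer = bytearray(host_data)      # explicit host-to-device copy
    for i in range(len(device_buffer)):
        device_buffer[i] ^= 0xFF              # stand-in for the real algorithm
    result = bytearray(device_buffer)         # explicit device-to-host copy
    return result, 2 * len(host_data)


def accelerate_in_place(shared: bytearray) -> int:
    """Model of an accelerator that shares memory with the processor.

    The accelerator works directly on the caller's buffer through a
    memoryview, so no bytes are copied. Returns the bytes moved (zero).
    """
    with memoryview(shared) as view:          # a view, not a copy
        for i in range(len(view)):
            view[i] ^= 0xFF                   # same stand-in algorithm
    return 0


if __name__ == "__main__":
    data = bytearray(b"\x00\x01\xff")
    result, moved = accelerate_with_copies(data)
    print(result.hex(), moved)

    shared = bytearray(b"\x00\x01\xff")
    moved = accelerate_in_place(shared)
    print(shared.hex(), moved)
```

In the first function the transfer cost grows with the size of the data set, while in the second the data never moves, which is the benefit the shared-memory approach is after.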
At the Intel Developer Forum last fall, Xilinx demonstrated the combination of a Virtex FPGA and Xeon IA processors connected via QPI. The demonstration used what are referred to as in-socket accelerators, implying that the FPGA is essentially a peer to the IA processor in a multiprocessor system.
The Xilinx demonstration relied on technology from Nallatech, which offers a variety of ways to augment an IA implementation with FPGA technology. The company also supports FSB-based FPGA accelerators. Moreover, whether the FPGA is in an FSB or QPI socket, the implementation uses Intel® QuickAssist Technology, which includes an Acceleration Abstraction Layer (AAL) to simplify the software development process for an IA system augmented with an accelerator such as an FPGA.
Xilinx also has a whitepaper entitled “High performance computing using FPGAs” that covers the combination of processors and FPGAs. The paper focuses equally on server applications and embedded applications in the military, communications, medical, and imaging segments.
Altera has also supported in-socket accelerators, and the company has a whitepaper entitled “FPGA coprocessing evolution: sustained performance approaches peak performance.” The photo below shows a product from Altera’s partner XtremeData that packs three Stratix FPGAs on a module for an FSB socket.
Intel has also integrated an Altera FPGA in the same package with an Intel® Atom™ processor in the E6x5C series, which was code-named Stellarton. That combination supports both SoC designs, where an embedded team uses the FPGA to implement specific peripheral functions, and applications where the FPGA acts as an accelerator for specific functions.
Do you have experience matching processors and FPGAs in compute-intensive applications? Have you used an FSB approach or have you already deployed a QPI-based design? And have you relied on the QuickAssist AAL? Please share your experience via comments. Fellow followers of the Intel® Embedded Community would greatly appreciate your input.
Roving Reporter (Intel Contractor)
Intel® Embedded Alliance
*Xilinx is an Affiliate member of the Intel® Embedded Alliance
**Altera is an Affiliate member of the Alliance