Communications applications, and specifically packet processing and forwarding, were once the sole domain of highly specialized network gear. Today, however, mainstream processors in the Intel® Architecture (IA) family can handle packet-processing tasks in a wide variety of applications, and embedded design teams may face the challenge of supporting such capability. Software such as Wind River's* asymmetric-multiprocessing (AMP) Network Acceleration Platform, combined with IA features such as multi-core architectures, Intel® Virtualization Technology (Intel® VT), and the Intel® QuickPath Interconnect (QPI), can deliver wire-speed Gigabit Ethernet performance.


Processors across the entire IA family can support some level of communications-oriented tasks. For instance, the Tolapai Intel® EP80579 SoC can enable communications appliances for applications such as network security.


Today, however, let's talk serious bandwidth. Let's discuss applications for products tied to 3G or 4G wireless networks, or even high-speed wired networks. Can an IA processor function in the so-called data plane and handle wire-speed tasks? The answer is a clear yes for the high end of the IA family, and specifically the Intel® Xeon® processor 5500 and 5600 series.


Until recently, the prevailing wisdom was that general-purpose processors worked great in the control plane of a communications system, but that the data plane required specialized packet-processing ICs. That may still be the best solution for the routers and switches that lie at the heart of the networks. But such networks also need a variety of application-specific products to enable features such as multimedia content delivery. And off-the-shelf products based on general-purpose processors offer the fastest, most flexible, and most cost-effective path to such deployment.


The answer to the performance requirement lies in the combination of the evolving processor feature set and innovative approaches to the software task, including Wind River's AMP-based approach. Intel has continued to raise the aggregate performance of IA ICs, both through more processing cores and through other innovative features. Recently, for instance, I discussed the new six-core Xeon 5600 that lies at the high end of the family.


Other key features include VT, which allows multiple operating systems and applications to run simultaneously on multi-core and multithreaded processors. You might peruse this recent VT post on the embedded community, and a search on the site will reveal far more background information.


The peripheral hardware on IA processors has also evolved to support faster data movement on and off chip. For example, I covered QPI in a recent post, and that technology is proving key to data movement on the Xeon series processors that are based on the Nehalem microarchitecture.


Wind River has leveraged the capabilities of Nehalem with its recently announced Network Acceleration Platform based on AMP. The company has demonstrated wire-speed packet processing using a standard Xeon 5500 series board.



Wind River measured IPv4 forwarding performance of 21 million packets per second using a 5500-based board with two cores and four threads dedicated to packet processing. The company noted that layer-3 packet forwarding is a readily accepted measure of packet-processing efficiency. The demonstration achieved the performance needed to handle the forwarding throughput of 14 Gigabit Ethernet ports. And the design is scalable across multi-core systems.
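The relationship between 21 million packets per second and 14 Gigabit Ethernet ports can be checked with a little arithmetic. The sketch below assumes the benchmark used minimum-size 64-byte Ethernet frames, which is the usual worst case for forwarding tests but is not stated in Wind River's figures. The function names are my own, chosen for illustration.

```python
# Standard Ethernet framing overhead, in bytes.
MIN_FRAME = 64        # minimum Ethernet frame, including the CRC
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP = 12   # mandatory idle time between frames


def wire_speed_pps(link_bps, frame_bytes=MIN_FRAME):
    """Maximum packets per second that one link can carry."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / bits_on_wire


def ports_supported(forwarding_pps, link_bps=1_000_000_000,
                    frame_bytes=MIN_FRAME):
    """Whole number of links a given forwarding rate can keep at wire speed."""
    return int(forwarding_pps // wire_speed_pps(link_bps, frame_bytes))


if __name__ == "__main__":
    # Assumption: 64-byte frames on 1-Gbit/s links.
    print(round(wire_speed_pps(1_000_000_000)))   # about 1.49 million pps per port
    print(ports_supported(21_000_000))            # ports served by 21 Mpps
```

Under that assumption, each Gigabit Ethernet port needs roughly 1.49 million packets per second, so 21 million packets per second is enough for 14 ports.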


The Wind River approach is based on AMP, whereas the bulk of multiprocessing systems use the more common symmetric multiprocessing (SMP) scheme. In SMP schemes, any task can run on any available core. And the SMP-capable operating system underlies and schedules all of the tasks.


The problem with SMP is that there is operating-system overhead that comes with the convenience of the flexible task scheduling. Wind River claims that its AMP-based test demonstrated five times the performance that an SMP-based implementation would deliver on the same processor board.


Wind River uses its own hypervisor and Intel VT to partition the system. Control-plane tasks run on a portion of the available cores atop a typical operating system such as the company's VxWorks real-time operating system or Wind River Linux. The packet-processing tasks are explicitly assigned to a set of cores that do not run a traditional operating system. Instead, those cores run what the company calls the Wind River Executive, which provides a bare-bones set of resources for the packet-processing algorithms. And the executive imposes minimal overhead.
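To make the idea of a dedicated packet-processing core more concrete, here is a minimal sketch of the kind of work such a core does: pull a packet from a receive queue, look up the destination with a longest-prefix match, and hand the packet to an output port, with no scheduler in between. This is not the Wind River Executive API. Every name here (ForwardingTable, poll_and_forward, the packet dictionary layout) is hypothetical, and Python is used only for readability; a real data plane would be written in C and run close to the hardware.

```python
import ipaddress
from collections import deque


class ForwardingTable:
    """Tiny IPv4 route table using a linear longest-prefix match."""

    def __init__(self):
        self._routes = []  # list of (network, output_port)

    def add_route(self, prefix, port):
        self._routes.append((ipaddress.ip_network(prefix), port))
        # Keep the most specific prefixes first so the first hit wins.
        self._routes.sort(key=lambda r: r[0].prefixlen, reverse=True)

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        for network, port in self._routes:
            if addr in network:
                return port
        return None  # no matching route


def poll_and_forward(rx_queue, table, tx_queues):
    """Drain the receive queue, forwarding each packet to completion."""
    forwarded = 0
    while rx_queue:
        packet = rx_queue.popleft()
        port = table.lookup(packet["dst"])
        if port is not None:
            tx_queues.setdefault(port, []).append(packet)
            forwarded += 1
        # Packets with no route are simply dropped in this sketch.
    return forwarded


if __name__ == "__main__":
    table = ForwardingTable()
    table.add_route("10.0.0.0/8", 1)
    table.add_route("10.1.0.0/16", 2)
    rx = deque([{"dst": "10.1.2.3"}, {"dst": "10.9.9.9"}, {"dst": "192.0.2.1"}])
    tx = {}
    print(poll_and_forward(rx, table, tx), sorted(tx))
```

The point of the sketch is the shape of the loop: a core that does nothing but poll and forward avoids the context switches and scheduling decisions that a general-purpose operating system would add.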


The Network Acceleration Platform also scales to support more traffic with more cores. The scheme can allocate the number of cores required for control-plane tasks and apply the remainder for data-plane packet processing. And adding processors essentially adds data-plane capacity.
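The allocation described above can be sketched in a few lines. The example below is only an illustration of the bookkeeping: the function names are hypothetical, and the capacity estimate assumes throughput scales linearly with the number of data-plane cores, which is a simplification and not a measured result.

```python
def partition_cores(total_cores, control_cores):
    """Split core IDs into a control-plane set and a data-plane set."""
    if not 0 < control_cores < total_cores:
        raise ValueError("need at least one core in each plane")
    cores = list(range(total_cores))
    return {
        "control": cores[:control_cores],   # run VxWorks or Linux
        "data": cores[control_cores:],      # run packet processing only
    }


def estimated_data_plane_pps(plan, pps_per_core):
    """Rough capacity estimate, assuming linear scaling across cores."""
    return len(plan["data"]) * pps_per_core


if __name__ == "__main__":
    # Hypothetical eight-core system with two cores kept for the control plane.
    plan = partition_cores(total_cores=8, control_cores=2)
    print(plan["control"], plan["data"])

    # Assumption: per-core rate taken as half of the two-core, 21-Mpps figure.
    print(estimated_data_plane_pps(plan, pps_per_core=10_500_000))
```

Real systems rarely scale perfectly, since memory bandwidth and I/O eventually become the limit, so treat the estimate as an upper bound for planning and not as a prediction.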


Wind River has an excellent paper called "Multi-core Network Acceleration" that's available with free registration on the company's web site.


Have you applied AMP techniques to accelerate the execution of critical elements of a multiprocessing application? Has an SMP approach come up short in one of your applications? Please share your thoughts and experience with fellow followers of the Intel® Embedded Community. Your peers would greatly appreciate reading comments that might shine a light on a troublesome multiprocessing issue.


Maury Wright

Roving Reporter (Intel Contractor)

Intel® Embedded Alliance


*Wind River is an Associate Member of the Intel® Embedded Alliance