Multi-core processor performance depends on a software developer's ability to make efficient use of the additional cores, which give developers more avenues to increase system computing performance. Most high-performance CPUs implement cache memory to improve performance: to lower data latency, the execution unit is surrounded by small blocks of high-speed SRAM, which it can access roughly 80 times faster than system memory. A packet-processing application performing TCP reassembly on a large number of TCP flows is likely to access a large amount of data spread across many memory locations, giving it lower locality of reference than an application reassembling a smaller number of flows.

The transition to multi-core processors promises more than additional execution cores and raw computational capability; it offers added flexibility for developing and optimizing higher-performance applications. In particular, multi-core architectures let developers organize program flow so that the cache associated with each individual execution core is used more effectively. With multiple caches available, software developers can optimize data locality, driving higher cache hit rates and improving overall application performance.
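The effect of locality of reference can be illustrated with a small sketch (the array, sizes, and function names here are invented for illustration, not taken from the text). Both traversals below compute the same sum, but the row-major version touches consecutive addresses, so every cache line fetched is fully used, while the column-major version strides across the array and touches a new cache line on almost every access:

```c
#include <stddef.h>

#define ROWS 1024
#define COLS 1024

static int grid[ROWS][COLS];

/* Populate the array with arbitrary data. */
void fill(void)
{
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            grid[r][c] = (int)(r + c);
}

/* Row-major traversal: consecutive accesses hit adjacent
 * addresses, so each fetched cache line is fully consumed
 * before the next line is needed (high locality). */
long sum_row_major(void)
{
    long sum = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            sum += grid[r][c];
    return sum;
}

/* Column-major traversal: each access jumps COLS * sizeof(int)
 * bytes, so nearly every access misses the cache and forces a
 * trip to system memory (low locality). */
long sum_col_major(void)
{
    long sum = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            sum += grid[r][c];
    return sum;
}
```

On typical hardware the row-major version runs several times faster even though the two functions perform identical arithmetic; the difference is purely cache behavior.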


Many service providers are pursuing IP-based technology to help them deploy new video and rich-media services and generate new revenue streams. This means delivering voice, video, broadband Internet and mobile services, and enriching the customer experience. To keep up with changing market demands, equipment makers are providing platforms that seamlessly connect multiple access networks, such as PSTN, xDSL, Wi-Fi and corporate LANs, using packet-based technologies that can enable a broad mix of services. Packet processing is likewise integral to a range of applications, including intrusion detection, VPNs, firewalls, gateways, routers and storage, so equipment manufacturers are under pressure to deliver systems that support converged IP-based traffic. When developing any application, engineers need to decide how much overhead their system can handle while still producing acceptable performance. As with any multi-processor system, the predominant challenge is to keep all the processors busy doing useful work rather than wasting CPU cycles waiting for another core to release a shared resource.
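As a hypothetical illustration of that challenge (the thread counts, names, and padding size below are assumptions, not from the text), compare a design where every core serializes on one lock with a partitioned design where each thread owns private state and no core ever waits on another:

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define ITERS 100000

/* Contended design: every thread serializes on one mutex, so
 * cores spend cycles waiting for another core to release it. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_count;

static void *count_shared(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);
        shared_count++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

long run_contended(void)
{
    pthread_t tid[NUM_THREADS];
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], NULL, count_shared, NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return shared_count;
}

/* Partitioned design: each thread updates a private slot, so no
 * lock is needed. Slots are padded to (an assumed) 64-byte cache
 * line so two cores never contend for the same line. */
static struct { long v; char pad[56]; } per_thread[NUM_THREADS];

static void *count_private(void *arg)
{
    long idx = (long)arg;
    for (int i = 0; i < ITERS; i++)
        per_thread[idx].v++;
    return NULL;
}

long run_partitioned(void)
{
    pthread_t tid[NUM_THREADS];
    long total = 0;
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], NULL, count_private, (void *)i);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        total += per_thread[i].v;
    return total;
}
```

Both designs produce the same total, but the partitioned version scales with core count because its threads never block on a shared resource; the single aggregation happens only after the parallel work is done.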


A new class of multi-core processor has begun to appear in a variety of storage, security, wireless base station, and networking applications. These devices combine eight, sixteen, or even sixty-four individual processor cores with integrated memory controllers, various I/O interfaces, and dedicated acceleration engines, and they have made great strides in overcoming the limitations of earlier-generation processors. Some vendors add hardware threading to hide memory latency and include a native 10 Gb/s interface, while others include security engines and even regular-expression engines aimed at specialized applications.