
The Massive Data Center Refresh

Posted by pkmatz Jul 30, 2010

Virtualization is what was behind the big capex savings at Dell – $200 million a year – achieved by running 10,000 virtual servers. HP also suffered from data center sprawl, with 85 data centers in 29 countries. It succeeded in boiling it all down to six new greenfield data centers, with encouraging results: the number of servers is down by 40 percent, processing power is up 250 percent, and overall data center costs have been slashed by 60 percent.


In light of recent downtime failures, Twitter is planning its first data center, located in Salt Lake City, which will help it manage its exponential growth of 300,000 new users per day. Twitter is the third high-profile technology company to build a significant data center there, after eBay and Oracle Corp. The National Security Agency also made recent news with its $1 billion data center investment at Camp Williams in Utah to support its intelligence-gathering operations.


And on it goes. But the work is just beginning.


Market analyst Gartner has estimated that nearly a million servers were due for replacement more than a year ago but weren't replaced. Remarkably, most organizations are unable to provide an accurate server count or even determine what software is running on those servers. Figuring it out manually is time-consuming and costly, so identifying where to generate cost savings is challenging, and executing on them is risky. It's no surprise, then, to find the data center world so bloated.


What is the actual server utilization at each center? Some reports indicate that typical servers are utilized only 10 to 15 percent of the time! The result has been a huge push to consolidate and virtualize data management and, consequently, to develop a range of cloud services.


Intel's recent launch of the Xeon® 5600 series of multicore processors has spurred ODMs and equipment vendors to build new servers that strengthen the business case for server consolidation and virtualization. Intel says the cost of not replacing 50 single-core servers with three 5600-based servers is approximately $10,000 a month in software support, utility, and warranty costs.
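Intel's figure can be sanity-checked with a back-of-the-envelope model. The per-server monthly cost below is a hypothetical placeholder chosen to roughly reproduce the cited savings, not Intel's actual data.

```python
# Illustrative sketch of the consolidation math: replacing 50 single-core
# servers with 3 multi-core servers. The per-server cost is hypothetical.

def monthly_savings(old_count, new_count, cost_per_server_month):
    """Monthly cost avoided by retiring (old_count - new_count) servers,
    where cost_per_server_month bundles software support, utility, and
    warranty costs."""
    return (old_count - new_count) * cost_per_server_month

# ~$213/server/month in combined costs yields roughly the cited
# $10,000/month for a 50-to-3 consolidation.
savings = monthly_savings(50, 3, 213)
print(f"Estimated savings: ${savings:,}/month")
```

The point of the exercise is that the savings scale linearly with the number of boxes retired, which is why consolidation ratios like 15:1 matter so much.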


The new Intel® Xeon® 5600 server processors offer six cores and can serve as the virtualization standard for enterprise private clouds. Using these cores for standard server tasks gets more done with less. Virtualized servers are also easily reconfigurable and cut down on configuration timetables.


Intel itself went through the server consolidation exercise and has already gained a 250 percent increase in capability at 60 percent of the cost. The goal is to make the upgrades pay for themselves within a year. The Xeon® 5600 processor series features 7-year lifecycle support, is built for thermally constrained and rugged communications environments, and can direct applications more quickly and efficiently between connected devices.


Intel® designed this Xeon with a three-pronged focus – blending security, performance, and energy efficiency – all of which can significantly improve the economics of data center operations. Intel reports that one 5600 processor can replace 15 single-core servers, deliver 60 percent better performance than the Xeon 5500, and achieve a return on investment in as little as five months.


Intel Intelligent Power Technology supplies the CPU and memory with only the power they need at any particular time. This enables IT administrators to pre-set SLAs for applications that require continuous high processing and to save energy on those that can afford to run at lower frequencies.


On the virtualization front, the Intel® Xeon® processor 5600/5500 series boosts virtualization performance by allowing the OS more direct access to the hardware. Also, Intel® Virtualization Technology (Intel® VT) FlexMigration enables seamless migration of running applications among current and future Intel® processor-based servers. Intel® VT FlexPriority improves virtualization performance by allowing guest OSs to read and change task priorities without VMM intervention. And in the Intel® 5520 chipset, Intel® VT for Directed I/O helps speed data movement and gives designated VMs their own dedicated I/O devices, reducing the VMM's overhead in managing I/O traffic.


Kontron has been in the embedded motherboard business for several years, and drawing on its extensive experience with high-performance IA-based ATCA blade and CPCI board products, last year introduced its first server-class embedded motherboard, the KTC5520-EATX.


This server board supports both the Xeon® 5500 and 5600 processor series, and is currently being used as a key building block for an assortment of communication infrastructure systems, including security servers, appliances, storage servers, and carrier-grade rack-mount servers. The Kontron motherboard design team has deep hardware, BIOS, and IPMI know-how, enabling it to work with vendors to design a server board to their exact server requirements.


An attractive selling point for network managers is the server board's built-in ability to be fully managed remotely. Kontron uses an Integrated Management Processor (IMP) that integrates VGA/2D, BMC, and KVM/VM over IP to support real-time access, with full keyboard, video, and mouse (KVM) control and virtual media (VM) from a single computer, anywhere, at any time. Compliant with IPMI 2.0 using IPMI over LAN, the server board provides an OS-independent, cross-platform interface for monitoring the server system's temperature, voltage, and fan status, among other items, and permits out-of-band management even when the main processors are not powered on. I/O features include two 10/100/1000 Mbps Ethernet ports (Intel 82576EB), six SATA ports (3Gb/s), integrated VGA (BMC), and HD 5.1-channel audio. Expansion slots include one PCIe Gen2 x8 in an (x16) slot, three PCIe Gen2 x8, one PCIe x4 in an (x8) slot, and one PCI 32/33 5V.
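At its core, the health monitoring the board exposes amounts to comparing sensor readings against thresholds. The sketch below illustrates only that threshold logic; the sensor names, limits, and readings are hypothetical, and real out-of-band access would go through the board's IPMI-over-LAN interface (for example, via the ipmitool utility) rather than code like this.

```python
# Sketch of the threshold-checking logic behind IPMI-style health
# monitoring. Sensor names, limits, and readings are hypothetical.

SENSOR_LIMITS = {
    "cpu_temp_c": (5.0, 85.0),    # (lower, upper) critical thresholds
    "vcore_v":    (1.10, 1.40),
    "fan1_rpm":   (1000, 20000),
}

def check_sensors(readings):
    """Return (sensor, value, status) tuples, flagging any reading
    outside its critical thresholds."""
    report = []
    for name, value in readings.items():
        lo, hi = SENSOR_LIMITS[name]
        status = "ok" if lo <= value <= hi else "critical"
        report.append((name, value, status))
    return report

readings = {"cpu_temp_c": 92.0, "vcore_v": 1.25, "fan1_rpm": 4500}
for name, value, status in check_sensors(readings):
    print(f"{name}: {value} [{status}]")
```

The value of running this on a BMC rather than the host OS is exactly what the paragraph above describes: the checks keep working even when the main processors are powered down.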


The Kontron KTC5520 is instrumental in building robust IT platforms for virtualization, server consolidation, and mission-critical business and database applications. The advantage of six cores (single or dual socket) is self-evident in performance-intensive applications which, with the support of Intel® Hyper-Threading technology, can work with 12 threads simultaneously. Processors in the Xeon 5600 family range from the four-core L5609 at 1.8GHz all the way up to the six-core X5680 running at 3.33GHz. All chips have 12MB of L3 cache regardless of core count.


With enterprises now looking to ramp up investment in server consolidation, the time is ripe for new server systems designed to make data centers greener, more powerful, and easier on the bottom line.

Communications applications – specifically packet processing and forwarding – were once the sole domain of highly specialized network gear. Today, however, mainstream processors in the Intel® Architecture family can handle packet-processing tasks in a wide variety of applications, and embedded design teams may face the challenge of supporting such capability. The combination of software such as Wind River's* asymmetric-multiprocessing (AMP) Network Acceleration Platform with IA features such as multi-core architectures, Intel® Virtualization Technology (Intel® VT), and the Intel® QuickPath Interconnect (QPI) can deliver wire-speed Gigabit Ethernet performance.


Processors across the entire IA family can support some level of communications-oriented tasks. For instance, the Tolapai Intel® EP80579 SoC can enable communications appliances for applications such as network security.


Today, however, let's talk serious bandwidth. Let's discuss applications for products tied to 3G or 4G wireless networks, or even high-speed wired networks. Can an IA processor function in the so-called data plane and handle wire-speed tasks? The answer is a clear yes for the high end of the IA family, specifically the Intel® Xeon® processor 5500 and 5600 series.


Until recently, the prevailing wisdom was that general-purpose processors worked great in the control plane of a communications system, but that the data plane required specialized packet-processing ICs. That may still be the best solution for the routers and switches that lie at the heart of the networks. But such networks need a variety of application-specific products to enable features such as multimedia content delivery. And off-the-shelf products based on general-purpose processors offer the fastest, most flexible, and most cost-effective path to such deployments.


The answer to the performance requirement lies in the combination of the evolving processor feature set and innovative approaches to the software task, including Wind River's AMP-based approach. Intel has continued to evolve the aggregate performance capabilities of IA ICs through both more processing cores and other innovative features. Recently, for instance, I discussed the new six-core Xeon 5600 that lies at the high end of the family.


Other key features include Intel VT, which allows multiple operating systems and applications to run simultaneously on multi-core and multithreaded processors. You might peruse this recent VT post on the embedded community; a search on the site will reveal far more background information.


The peripheral hardware on IA processors has also evolved to support faster data movement on and off chip. For example, I covered QPI in a recent post, and that technology is proving key in data movement on Xeon series that are based on the Nehalem microarchitecture.


Wind River has leveraged the capabilities of Nehalem with its recently announced Network Acceleration Platform based on AMP. The company has demonstrated wire-speed packet processing using a standard Xeon 5500 series board.



Wind River measured IPv4 forwarding performance of 21 million packets per second using a 5500-based board with two cores and four threads dedicated to packet processing. The company noted that layer-3 packet forwarding is a widely accepted measure of packet-processing efficiency. The demonstration achieved the performance needed to handle the forwarding throughput of fourteen Gigabit Ethernet ports. And the design is scalable across multi-core systems.
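The port count follows from standard Ethernet arithmetic: at the 64-byte minimum frame size, each frame also occupies an 8-byte preamble and a 12-byte inter-frame gap on the wire, so a Gigabit Ethernet port tops out at about 1.49 million packets per second. A quick sketch of that math:

```python
# Back-of-the-envelope check of the forwarding numbers: at minimum frame
# size, each GbE port carries ~1.49 Mpps, so ~21 Mpps corresponds to
# roughly fourteen ports.

GBE_BITS_PER_SEC = 1_000_000_000
# 64-byte minimum frame + 8-byte preamble + 12-byte inter-frame gap
WIRE_BYTES_PER_MIN_FRAME = 64 + 8 + 12

pps_per_port = GBE_BITS_PER_SEC / (WIRE_BYTES_PER_MIN_FRAME * 8)
ports_covered = 21_000_000 / pps_per_port

print(f"{pps_per_port:,.0f} packets/s per GbE port")
print(f"21 Mpps covers about {ports_covered:.1f} ports")
```

In other words, the measured 21 Mpps is consistent with wire-speed, minimum-size-packet forwarding on fourteen ports.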


The Wind River approach is based on AMP, whereas the bulk of multiprocessing systems use the more common symmetric multiprocessing (SMP) scheme. In SMP schemes, any task can run on any available core, and an SMP-capable operating system underlies and schedules all of the tasks.


The problem with SMP is that there is operating-system overhead that comes with the convenience of the flexible task scheduling. Wind River claims that its AMP-based test demonstrated five times the performance that an SMP-based implementation would deliver on the same processor board.


Wind River uses its own hypervisor and VT technology to partition the system. Control plane tasks run on a portion of the available cores atop a typical operating system such as the company's VxWorks real-time operating system or Wind River Linux. The packet-processing tasks are explicitly assigned to a set of cores that do not run a traditional operating system. Wind River has what it calls the Wind River Executive that provides a bare-bones set of resources for the packet-processing algorithms. And the executive imparts minimal overhead.
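Wind River's partitioning happens beneath the operating systems, in its hypervisor; but the underlying idea – explicitly dedicating cores to a workload instead of letting a scheduler float tasks anywhere – can be loosely illustrated with ordinary Linux CPU affinity. This is only a sketch of core partitioning, not the Wind River Executive, and it is Linux-specific (os.sched_setaffinity); the core numbers are hypothetical.

```python
import os

# Sketch of AMP-style core partitioning using Linux CPU affinity: pin a
# workload to an explicit set of cores rather than letting the SMP
# scheduler place it anywhere. (Wind River does this beneath the OS with
# a hypervisor; this only illustrates the idea.)

def dedicate_cores(pid, wanted):
    """Pin a process to a set of cores, falling back to whatever subset
    of those cores actually exists on this machine."""
    available = os.sched_getaffinity(pid)
    chosen = (wanted & available) or available
    os.sched_setaffinity(pid, chosen)
    return chosen

# e.g. reserve cores 0-1 for packet-processing work in the current process
chosen = dedicate_cores(0, {0, 1})
print(f"running on cores: {sorted(chosen)}")
```

The payoff Wind River claims comes from what this removes: once a core is dedicated, no general-purpose scheduler sits between the packet-processing loop and the hardware.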


The Network Acceleration Platform also scales to support more traffic with more cores. The scheme can allocate the number of cores required for control-plane tasks and apply the remainder for data-plane packet processing. And adding processors essentially adds data-plane capacity.


Wind River has an excellent paper called "Multi-core Network Acceleration" that's available with free registration on the company's web site.


Have you applied AMP techniques to accelerate the execution of critical elements of a multiprocessing application? Has an SMP approach come up short in one of your applications? Please share your thoughts and experience with fellow followers of the Intel® Embedded Community. Your peers would greatly appreciate comments that might shine a light on a troublesome multiprocessing issue.


Maury Wright

Roving Reporter (Intel Contractor)

Intel® Embedded Alliance


*Wind River is an Associate Member of the Intel® Embedded Alliance



Ultrasound is one of the most widely used diagnostic tools in modern medicine. In the late 1970s, ultrasound required specially trained physicians to interpret the results from technically demanding analog equipment. Some research equipment featured realistic two-dimensional grey-scale images, but the complexity and excessive tuning required to keep the equipment functioning correctly restricted its use to very specialized applications. Nevertheless, the benefits in diagnostics, especially in gynecological exams, caused ultrasound to become widely used.


Ultrasound is technically any sound above 20kHz, but practical medical ultrasound is often targeted between 1 and 2 MHz, with some systems working as high as 50MHz. The high-frequency sound is generated by ultrasonic elements, configured in an array for some types of equipment. A phased array probe is made of many ultrasonic elements, each of which is independently pulsed. Varying the timing – pulsing the elements in sequence along a row – creates an interference pattern that forms a beam at a specific angle, which can be steered electronically. The beam is swept through the body's tissue, and the data from multiple beams are assembled mathematically to create an image showing a slice through the body.
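The timing trick above reduces to simple geometry: steering a linear array's beam to angle θ requires a per-element firing delay step of d·sin(θ)/c, where d is the element pitch and c the speed of sound. A sketch of that calculation, using representative values (the element count, pitch, and angle are illustrative, not a specific probe):

```python
import math

# Sketch of the delay calculation behind electronic beam steering for a
# linear phased array. Element count, pitch, and angle are illustrative;
# c ~ 1540 m/s is the usual assumed speed of sound in soft tissue.

def steering_delays(n_elements, pitch_m, theta_deg, c_m_s=1540.0):
    """Firing delay (seconds) for each array element to steer the beam
    theta_deg off axis; the earliest-fired element gets delay 0."""
    step = pitch_m * math.sin(math.radians(theta_deg)) / c_m_s
    delays = [i * step for i in range(n_elements)]
    base = min(delays)  # handles negative steering angles too
    return [d - base for d in delays]

# a 64-element array with 0.3 mm pitch steered 20 degrees off axis
delays = steering_delays(64, 0.3e-3, 20.0)
print(f"delay step: {delays[1] - delays[0]:.2e} s")
```

Sweeping θ across a range of angles, one firing per angle, is what produces the fan of beams that the downstream math assembles into a slice image.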


A phased array system can be very powerful, but the probe is fairly expensive. Even though the probe costs more than alternative technologies, the benefits are significant: an electronically steerable acoustic beam enables massive amounts of data to be gathered quickly. Large amounts of data mean that a significant computational load is placed on the central processor. A key problem with this type of imaging is that it's impossible to make the sound beams thin enough to resolve structures directly. Overlapping sound sources mean that each bit of output data must be derived from the interaction of reflections in a small volume around the point being scanned. The result is a specific type of blurring in the output images called "specklation" or "speckle".


Recovering an image from the data set is computationally intensive. Intel processors like the ATOM™ processor can run a library of real-world signal processing routines to recover and deblur the ultrasonic image. In addition, more processing power can be achieved by employing a multicore processor. Alternatively, multiple multicore processors may be used to divide the application into distinct functions.
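To give a feel for one small piece of that processing pipeline, here is a minimal sketch of a common speckle-reduction step, a 3x3 median filter, written in plain Python for clarity. A production system would use optimized signal-processing libraries rather than code like this.

```python
# Minimal sketch of a 3x3 median filter, a common speckle-reduction step.
# Plain Python for clarity; real systems use optimized libraries.

def median3x3(img):
    """Apply a 3x3 median filter to a 2D list of pixel values,
    leaving the one-pixel border unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(
                img[yy][xx]
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            )
            out[y][x] = window[4]  # median of the 9 values
    return out

# an isolated bright speckle in a flat region is suppressed
frame = [[10] * 5 for _ in range(5)]
frame[2][2] = 200
print(median3x3(frame)[2][2])  # prints 10: the outlier is replaced
```

The median filter is attractive for speckle precisely because it discards isolated outliers without smearing edges the way a simple averaging filter would.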


Advantech’s (1) AIMB-210 board was selected to form the base of a medical ultrasound system.




According to the company, the ATOM™ board was chosen because the customer was looking for a reliable industrial-grade computer. Product support and reliability were important, as were medical-grade features. The Advantech board was selected as a powerful and reliable computing platform, supporting the I/O connectivity and performance to control other devices. One feature often overlooked in equipment for office environments is fan noise. Physicians' offices require low noise levels, making low power dissipation key.


Mathematical libraries suitable for image reconstruction are available from many sources, including The Math Forum. If you're looking for a way to learn more about ultrasound for medical applications, there's a mobile phone application called MobileUS, a program that uses industry-standard USB-based ultrasound probes with a cell phone. The C# software may be licensed under the BSD license (licensing was the topic of a recent blog).

USB Ultrasound (USBUS) is another ultrasound program that uses USB to interface to probes from Interson. These probes and associated software do not provide full imaging, with the image processing performed in the probe itself.


Real ultrasound equipment has other system requirements that can be satisfied by Intel technology. One example of Intel-enabled technology is fastboot from QNX Software Systems(2). QNX fastboot technology is a specific feature of the QNX Neutrino® RealTime Operating System. It eliminates the need for a BIOS on Intel ATOM platforms, reducing system costs while improving instant-on performance. Systems designers can use fastboot technology to deliver fast boot times for a wide variety of medical and other applications.


Considering the image of the Advantech-based ultrasonic equipment, there are a number of system requirements that are not directly related to the technical requirements of controlling or displaying the results from a phased array ultrasonic system. A LAN interface is a typical data communications requirement for modern medical diagnostic equipment; in another blog we've studied several systems that use Ethernet for LAN connection. USB interfaces are specified for controlling a printer, DVD recorder/player, and the ultrasound probe. USB support is a fundamental part of many packages, including those from Green Hills Software (3) and Wind River Systems (4).


Fortunately, traditional data processing functions are part of offerings from Intel, Green Hills, and Wind River. But the lion's share of the code that needs to be developed lies outside commercial offerings – an issue for products outside the norm of large-volume applications.


When your applications are not “mainstream” for commercial vendors, how will you make the choice of tool/library vendor?



1. Advantech is a Premier Member of the Intel® Embedded Alliance

2. QNX is an Associate Member of the  Intel® Embedded Alliance

3. Green Hills Software is an Affiliate Member of the  Intel® Embedded Alliance

4. Wind River Systems is an Associate Member of the  Intel® Embedded Alliance


Henry Davis
Roving Reporter (Intel Contractor)
Intel® Embedded Alliance


Industrial and military robots come in all shapes and sizes, tethered and untethered, autonomous or human operated, and designed for many different environments. But no matter how advanced robots are today, we’re still a long way from having a humanoid robot from science fiction.   


Where do you start in assembling a practical military (or industrial) robot? As with any engineering task, we need a set of objectives and a mission, which in turn defines the capabilities that need to be part of the robot.  With this in mind, it’s instructive to consider some current military robots.


Most everyone has seen Explosive Ordnance Disposal (EOD) robots demonstrated many times during the last few years, but military robots have a long, if punctuated, battlefield history. One of the earliest robots was used during World War II.




Goliath was a rudimentary, remote-controlled demolition robot – the first military use of robots. Today, development of military robots is proceeding at a breakneck pace, fueled by recent international conflicts. The TALON type robot is typical of these land-based developments.




Despite the age difference between Goliath and the base-level TALON, both robots rely on tracks for propulsion. As with most wheeled or tracked vehicles, tracks have the advantages of mechanical simplicity and ruggedness. This contrasts with mechanically complex articulated walking platforms like Boston Dynamics' BigDog.


Military and industrial robots have a wide variety of uses in many different environments. Considering land based robots only, there are a number of essential functional requirements:


  1. Communication
  2. Navigation
  3. Propulsion
  4. Sensing
  5. Effecting


Communication for robots began as a concept of operator-machine communications. Today, modern battlefields and industrial spaces have adopted what is known in military circles as "network centric warfare." The name is somewhat misleading: the "network" in the title is not literally an electronic network; rather, it describes the way modern militaries will organize and fight in the Information Age. Network Centric Warfare (NCW) translates information superiority into combat power, achieved by effectively linking knowledgeable entities in the battlespace. The physical domain is where events take place and are perceived by sensors and individuals. Data emerging from the physical domain is transmitted through an information domain, then received and processed by a cognitive domain, where it is assessed and acted upon. Effectively, the NCW process reproduces the US military's "observe, orient, decide, act" loop.


The physical network can be implemented through libraries available from companies like Wind River Systems(1) and Green Hills Software(2). Wind River Platform for Industrial Automation combines a Real Time Operating System (RTOS), network software, and other middleware that may be used to develop state-of-the-art applications, including robots.


In addition to commercial product offerings, you can also employ Open Source libraries, depending on the level of protocol completeness you need – a thin layer or full-featured TCP. ENet's purpose is to provide a relatively thin, simple, and robust network communication layer on top of UDP (User Datagram Protocol). The primary feature it provides is optional reliable, in-order delivery of packets. For full TCP, one of the available libraries is STLPlus.
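The receiver side of reliable, in-order delivery over UDP can be sketched as a sequence-numbered reorder buffer: out-of-order datagrams are held until the gap is filled. This is only a conceptual sketch of the mechanism, not ENet's actual wire protocol (which also handles acknowledgments, retransmission, and channels).

```python
# Conceptual sketch of reliable in-order delivery over UDP: a receiver-
# side reorder buffer keyed by sequence number. Not ENet's wire protocol.

class ReorderBuffer:
    def __init__(self):
        self.next_seq = 0
        self.pending = {}  # seq -> payload, held until deliverable

    def receive(self, seq, payload):
        """Accept one datagram; return the payloads that become
        deliverable in order (empty if a gap remains)."""
        delivered = []
        self.pending[seq] = payload
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

buf = ReorderBuffer()
print(buf.receive(1, "b"))  # held: packet 0 is missing -> []
print(buf.receive(0, "a"))  # gap filled -> ['a', 'b']
```

Making this reliability optional per packet, as ENet does, is what keeps the layer "thin": traffic that tolerates loss skips the buffering entirely.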



One significant advantage of this particular Open Source library is its independence from other communications packages – it doesn't require Windows message handling or threading to work. STLPlus is likewise distributed under a BSD-style license. (Open Source licenses were covered in an earlier blog.)


Military planners prefer peer-to-peer networking as part of NCW. There are numerous advantages to a peer-to-peer network, including the elimination of a central server as a requirement for system operation. From a military operations standpoint, eliminating a centralized server removes one critical failure mechanism that could disable many pieces of equipment. Operationally, peer-to-peer topologies permit sharing of resources between units and also provide a mechanism for improving system robustness. System robustness can be achieved through the use of virtualization techniques combined with certified operating systems like those offered by Green Hills and Wind River.


In an automotive blog I discussed GPS-based navigation from the retail automotive standpoint. But military systems have a greater requirement for reliable autonomous operation. In the US Army Integrated Armed Robotic Vehicle-Assault, the Autonomous Navigation System is capable of controlling several other classes of manned and unmanned vehicles. Where automotive navigation systems are fundamentally a convenience, military autonomous navigation is justified by a risk/reward/cost assessment. An unmanned assault vehicle that can't reach its objective because it has become lost may be more than simply one piece of lost equipment; such occurrences put the objective in jeopardy. There are important differences between domestic US automotive road navigation and military navigation. Where civilian navigation can rely on the physical boundaries of a roadbed as part of the steering algorithm, military autonomous vehicles have few hard boundaries; on many modern battlefields, "roads" may be defined by expediency, including cross-country navigation with no roads at all. For this reason, military robots need to find their way via Global Navigation Satellite Systems (GNSS) combined with vision systems and dead reckoning. GNSS is not available in parking garages, tunnels, or areas where high-rise buildings block satellite signals, so designers of military equipment must provide a fallback navigation mechanism. An Open Source project is underway to develop such navigation systems. Perhaps the most recognized autonomous vehicle is Stanford University's Stanley, which won the DARPA challenge.
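Dead reckoning, one of the fallbacks named above, can be sketched as integrating heading and speed over time. The headings, speeds, and time steps below are illustrative; a fielded system would fuse this estimate with GNSS and vision fixes to bound the drift that accumulates from sensor error.

```python
import math

# Sketch of dead reckoning: advance an (x, y) position by integrating
# heading and speed over time. Legs below are illustrative; a real system
# fuses this with GNSS/vision fixes to bound accumulated drift.

def dead_reckon(start, legs):
    """Advance an (x, y) position through (heading_deg, speed_m_s, dt_s)
    legs, measuring heading counterclockwise from the +x axis."""
    x, y = start
    for heading_deg, speed, dt in legs:
        h = math.radians(heading_deg)
        x += speed * math.cos(h) * dt
        y += speed * math.sin(h) * dt
    return x, y

# drive east for 10 s, then north for 5 s, at 2 m/s
pos = dead_reckon((0.0, 0.0), [(0, 2.0, 10.0), (90, 2.0, 5.0)])
print(pos)
```

The weakness is equally visible in the model: any error in heading or speed is integrated along with the good data, which is why dead reckoning alone is only a stopgap between satellite or vision fixes.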


Propulsion is largely a matter of the terrain expected to be encountered. Many military and industrial mobile robots use tracks because they are relatively simple and robust. But there are alternatives with different tradeoffs. For example, research in self-balancing two-wheeled robots has yielded a basic balancing robot. Tracks and wheels have problems with obstacles that are too high for them but of no consequence to people. Walking robots provide a unique alternative: a four-legged walker is much heavier than tracked or wheeled versions, and while walking platforms are remarkable, they use significant power to move. There seems to be no end to biologically motivated developments – recent research from Boston Dynamics has also yielded wheeled robots that can jump.


Sensing and effecting have a wide variety of alternatives. Sensors can use visible-spectrum cameras, LIDAR, RADAR, magnetic sensors, and more. Effectors are mechanical interfaces to physical objects: pincers, grippers, articulated appendages, rotators, and trigger pulls are just a few. Bio-models have a critical role to play in future developments. Boston Dynamics offers Digital Biomechanics, the world's first simulation tool aimed at letting engineers evaluate the effects of equipment, clothing, and tasks on human soldiers. Boston Dynamics has also used the tool to model advanced robotic systems such as BigDog, PETMAN, and others.


All of these hardware alternatives have a mechanism for defining hardware actions in MARIE, a heterogeneous modeling system. The primary goal of MARIE is to enable quick reuse of mechanical systems that may apply to robot design. This type of tool is critical for robot development: as essential as novel mechanical structures are to a practical robot, the interaction of software components is equally critical to success. Many of the robots reported in the literature employ high-end microcontrollers as expedient processors, but these lack the processing power needed to build commercially viable robots.





The SPARTICUS social interaction robot is mobile and can parse human input using three 1.6GHz processors to perform its tasks. Today, those same Pentium® class processors can be easily replaced by a multicore ATOM™ processor.


Considering the complexity of this simple social interaction robot, how will you design your next robot?



1. Wind River Systems is an Associate Member of the Intel Embedded Alliance

2. Green Hills Software is an Affiliate Member of the Intel Embedded Alliance



Henry Davis
Roving Reporter (Intel Contractor)
Intel® Embedded Alliance


