We now have access to more data than ever before, thanks to the ubiquitous information-gathering devices spread around the globe. This Big Data presents significant opportunities and many challenges. Much of this data is sensitive information that can be used to gain insights into individuals and businesses. Large databases and significant processing power are required to store and analyze the information, and robust data security is needed to ensure the data is protected from malicious access.
In this blog I am going to explore the benefits of using the Intel® Xeon® processor E5-2600 series and the Intel® Communications Chipset 89xx Series to implement data processing and enhance big data security. I am using implementation examples from Advantech, a Premier member of the Intel® Intelligent Systems Alliance. The 250-plus members of the Alliance collaborate closely with Intel to create hardware, software, tools, and services to help speed intelligent systems to market.
Processing Big Data
Large quantities of data are collected from many sources, including remote and wireless sensors, radio-frequency identification (RFID) readers, cameras, microphones and other information-sensing devices. Big data is increasing in size, with some datasets running to several petabytes. To gain the full benefit from big data, systems need to rapidly process large data sets and correlate information across multiple sources.
The processing of unstructured and semi-structured data has been made significantly easier by the Apache Hadoop software library. Hadoop is a scalable framework that supports the distributed processing of big data across clusters of computers. Hadoop scales from a single server up to many thousands of servers, making the results of big data analysis accessible to a wide range of businesses.
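To make the distributed-processing idea more concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases that a MapReduce-style job on Hadoop spreads across a cluster. It does not use any Hadoop API. The function names and the sample sensor records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def map_phase(record):
    """Emit one (device_type, 1) pair for a 'device, reading' record."""
    device, _reading = record.split(",", 1)
    yield device.strip(), 1

def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical records from information-sensing devices.
SAMPLE_RECORDS = ["rfid, tag=17", "camera, frame=3", "rfid, tag=42"]

if __name__ == "__main__":
    print(run_job(SAMPLE_RECORDS))
```

In a real cluster the map and reduce steps would run in parallel on many servers, with the shuffle moving data between them over the network, which is one reason I/O and network throughput matter so much.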
Big data and Hadoop processing require significant I/O and storage throughput. The Intel® Distribution for Apache Hadoop software integrates hardware-enhanced performance and security capabilities that deliver significant performance gains. The performance of Hadoop can be increased by more than 30 times over previous generations of the Intel® Xeon® processor by using the Intel Distribution for Apache Hadoop with the latest Intel® Xeon® processor E5-2600 series, 10GbE network connections and solid-state drives.
Big Data Security
The use of big data creates some significant security issues. Big data is collected through a wide range of information-gathering devices. Some data from these devices will contain sensitive information and therefore needs to be protected. This includes Personally Identifiable Information (PII), Protected Health Information (PHI) and Intellectual Property (IP). Much of the data can become more sensitive when viewed in the context of data from other sources. Companies therefore need to consider encrypting some or all data when it is moved and then stored.
The encrypted data is decrypted for processing and then re-encrypted before being moved back to the storage drives. The results of any processing may also need to be encrypted before being stored or moved to an external device. The greatest efficiency and security are achieved by using the same system for processing and decryption/encryption.
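The decrypt, process, re-encrypt cycle described above can be sketched in a few lines. This is only an illustration of the data flow. The `toy_encrypt` and `toy_decrypt` helpers are hypothetical stand-ins built from a SHA-256 keystream so the example runs with the Python standard library alone. They are not AES, they provide no integrity protection, and they should not be used to protect real data.

```python
import hashlib
import os

NONCE_LEN = 16  # bytes of random nonce stored in front of each ciphertext

def _keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes from key, nonce and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])

def toy_encrypt(key, plaintext):
    """Stand-in cipher (NOT AES): XOR the plaintext with a fresh keystream."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

def toy_decrypt(key, blob):
    """Reverse toy_encrypt using the nonce stored at the front of the blob."""
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

def process_encrypted(key, stored_blob, transform):
    """Decrypt stored data, process it, and re-encrypt the result for storage."""
    plaintext = toy_decrypt(key, stored_blob)
    result = transform(plaintext)
    return toy_encrypt(key, result)

if __name__ == "__main__":
    key = os.urandom(32)
    stored = toy_encrypt(key, b"reading=21")
    updated = process_encrypted(key, stored, bytes.upper)
    print(toy_decrypt(key, updated))
```

Note that the plaintext exists only inside `process_encrypted`. Keeping decryption, processing and re-encryption on one system means unprotected data never has to cross a network link.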
Accelerating Big Data Security
Data encryption in software requires a significant number of processor cycles. The Intel Distribution for Apache Hadoop is optimized for Intel® Advanced Encryption Standard New Instructions (Intel® AES-NI) supported on Intel Xeon processors. This has been shown to accelerate encryption performance in an Apache Hadoop cluster by 5.3x and decryption performance by 19.8x. Significantly higher security performance can be achieved by using hardware security acceleration that is closely coupled to the processor.
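Before counting on AES-NI acceleration, it can be useful to confirm that a server's processor actually advertises the instructions. The sketch below assumes a Linux host, where `/proc/cpuinfo` lists `aes` among the CPU flags when AES-NI is available. The function names are hypothetical, and other operating systems expose this information differently.

```python
def has_aes_flag(cpuinfo_text):
    """Return True if any 'flags' line in cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags" and "aes" in value.split():
            return True
    return False

def host_supports_aes_ni(path="/proc/cpuinfo"):
    """Check the local Linux host; report False if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as handle:
            return has_aes_flag(handle.read())
    except OSError:
        return False

if __name__ == "__main__":
    print("AES-NI advertised:", host_supports_aes_ni())
```

The flag only shows that the instructions are present. Whether a given workload benefits depends on the encryption library in use being built to take advantage of them.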
The Intel Communications Chipset 89xx Series devices integrate hardware acceleration for decryption/encryption and compression and are closely coupled to Intel Xeon processors through PCIe Gen 2.0 and DMI interfaces. As shown in Figure 1, these devices also integrate quad Gigabit Ethernet, PCIe Gen 1 and other I/O interfaces.
Figure 1. Intel® Communications Chipset 89xx Series.
The Intel Distribution for Apache Hadoop supports the Intel® QuickAssist Technology acceleration built into the Intel Communications Chipset 89xx Series. Security performance can be scaled by adding more Intel Communications Chipset 89xx Series devices.
Scalable Hardware Platform Solutions
Intel Xeon processor E5-2600 series with the Intel Communications Chipset 89xx Series are integrated into a wide range of platform solutions for computing and communications applications including carrier grade servers, network appliances and ATCA. Many of these platform solutions will support the Intel Distribution for Apache Hadoop with hardware acceleration for security.
The Advantech CGS-6000 carrier grade server integrates dual Intel Xeon processors from the E5-2600 and E5-2600 v2 series. Hardware security acceleration can be added using Advantech PCIe cards, each with four Intel Communications Chipset 89xx Series devices, as shown in Figure 2. The CGS-6000 system has four full-height PCIe x8 slots.
Figure 2. Advantech PCIe Card with four Intel® Communications Chipset 89xx Series.
ATCA is a scalable platform for computing and communications applications. The Advantech MIC-5333 ATCA processor blade integrates the Intel Xeon processor E5-2600 series and the Intel Communications Chipset 89xx Series. The blade also supports mezzanine modules with up to four additional Intel Communications Chipset 89xx Series devices. Advantech Netarium™ ATCA System platforms are available with 2 to 14 slots and optional extended rear transition modules (eRTM). The Advantech eRTM module shown in Figure 3 supports up to four network mezzanine cards (NMC) that can be used for further security acceleration or other functions. NMCs are also supported on the Advantech FWA-6510 Network Appliance.
Figure 3. Advantech ATCA Extended Rear Transition Module (eRTM)
The combination of Intel Xeon processor E5-2600 series, Intel Communications Chipset 89xx Series and Intel Distribution for Apache Hadoop allows system managers to deploy the right hardware and software solutions to enhance big data security.
Roving Reporter (Intel® Contractor), Intel® Intelligent Systems Alliance
Principal Consultant, Earlswood Marketing
Follow me on Twitter: @simon_stanley