Performance assessment is always in the eye of the beholder, and measuring processor core performance is no different. The best way to assess performance is to compile the actual application, or at least critical portions of it, but sometimes the cost of doing that just for a comparison is prohibitive. The embedded industry has been striving for a fair way to compare processor performance for decades, and here we'll take a brief look at EEMBC's new CoreMark benchmark.
In today's typical processor chipset and application, integer core performance is still a key driver of overall application performance. This is especially true of multimedia chipsets with dedicated acceleration units for graphics, audio, video, security, and other functions which reduce dependence on floating point units for complex arithmetic. Solid integer performance is needed for basic arithmetic, string manipulation, indexing and branching, and other oft-used functions.
Why was a new benchmark needed? Reviewing a bit of benchmark history shows the answer.
Since its introduction in 1984, Dhrystone set the standard in integer benchmarking for many years. A synthetic benchmark, simplistically designed to exercise common functions, Dhrystone gained popularity as a companion to the popular Whetstone floating point score as a way to assess performance. The problem was that Dhrystone could be gamed, and it became a target for compiler writers: a powerful compiler with the right optimization switches, such as replacing string-copy loops with straight word moves, could actually wipe out large portions of the Dhrystone code's work and artificially inflate results. It became a much better test of compilers and optimization flags than of processors.
Taking another approach, SPECint has roots going back to 1989 and has undergone several revisions leading to its current version, CINT2006. To get around the simplicity of Dhrystone, CINT2006 is a suite of 12 programs that exercise more real-world code, as the test categories suggest: combinatorial optimization, video compression, XML processing, discrete event simulation, and more. CINT2006 isn't set up for simple microcontroller cores, and it allows multi-copy execution to take advantage of the multiple execution threads available on larger processors with multiple cores or hyperthreading. While it's a more comprehensive test of integer performance, it's also more difficult to set up and run in an embedded environment when a user wants to validate or experiment with results.
Most recently, EEMBC has attacked the problem with industry-specific benchmarking suites, but it also recognized the need for a widely available, generic benchmark that targets the processor core and scales from small MCUs to large multicore processors. CoreMark was introduced on June 1, 2009, and targets several major tasks commonly used in embedded applications to help assess processor core performance:
- List manipulation, stretching pointers and data access through pointers
- Matrix manipulation, with serial data access and instruction level parallelism
- Simple state machine operation, exercising the branch unit in the pipeline
- Cyclic Redundancy Check (CRC) computations
These tests exercise pipeline operation, memory and cache performance, and integer-operation handling. CoreMark focuses exclusively on the core and nothing else. While the code can be cloned and run on multiple cores, there is no interaction between cores, and synchronization occurs only at the end. Speedup is therefore linear with the number of cores, and there is no testing of cache coherency or bus arbitration. CoreMark tells the user how good a single core really is in its environment. Compiled for a typical Intel Architecture processor using gcc, EEMBC claims a code size of no more than 16 KB, which makes it targetable to smaller microcontrollers as well.
[Sample CoreMark report, courtesy of EEMBC]
In a welcome and interesting aside, CoreMark supports EEMBC's EnergyBench, which reports how much energy a platform consumes while running benchmarks. The platform data includes both processor power consumption and consumption from other devices, all of which is helpful in determining how to optimize battery life by tuning software.
There's one other point to make: CoreMark is an open source benchmark. I can run it, you can run it, and any registered EEMBC user can run it and post a result. You'll notice that results posted on the EEMBC site include the submitting company and name (in some cases, N/A). The results posted for a particular processor and configuration have not necessarily been vetted by the processor manufacturer, even if that manufacturer is an EEMBC member (which Intel is). Community is good for our industry, but it is what it is.
I spoke with an Intel contact on the embedded benchmarking team in Chandler, AZ; they've been working with the CoreMark code and are looking at the details. (Thanks to them for the review and comments on this, by the way.) They see the need and like the idea behind CoreMark, but keep in mind that the results currently on the EEMBC site don't necessarily reflect what they will ultimately find.
Over 20 results for Intel processors are currently on the EEMBC site; here's a snapshot of a few I selected for their embedded interest.
Processor, compiler, CoreMark score, CoreMark/MHz, parallel execution if any
- Intel Core 2 Duo CPU E6750 2.66GHz, GCC4.2.4 (Ubuntu 4.2.4-1ubuntu4), 14067.396, 5.288, 2 PThreads
- Intel Pentium M 760 2.0GHz, GCC4.3.2, 6240.00, 3.120
- Intel Atom N270 1.6GHz, Microsoft Visual C++ 2005, 4058.441, 2.537, 2 PThreads
- Intel Core 2 Duo (Mobile U7600) 1.2GHz, Microsoft Visual Studio 2008 Version 9.0.21022.8 RTM (CL15), 3223.20, 2.686
- Intel Celeron (Netburst) 2.0GHz, Microsoft Visual Studio 2008 Version 9.0.21022.8 RTM (CL15), 2748.00, 1.374
Speaking with several ECA member companies on CoreMark and benchmarking in general, two things are obvious: CoreMark is new and folks are still studying it, and most of the requests board and system suppliers get are for more I/O intensive benchmarks rather than core-only tests like CoreMark. But my calls did send several people off thinking more about this and what it means. We'd welcome specific thoughts in our comments section.
I invite you to look at the full results for these and other processors. Rather than attempt to interpret these results, I'll open it up to discussion. What do you think of the history and state of integer benchmarking? What are your thoughts on and experience with CoreMark? How useful is a core-only benchmark? What observations do you have on how Intel microprocessors do in these kinds of integer tests? What would you like to see as Intel, EEMBC, and the community move forward with these ideas?
OpenSystems Media®, by special arrangement with Intel® ECA