Introducing BEEBS: Benchmarking for energy

by James Pallister on June 20, 2014

One of the goals of MAGEEC is to work with real measurement data for the machine learner to train on. As shown in a previous blog post, we have our own measurement board and a Python library for working with it. The natural next step is to take measurements on a variety of platforms and relevant programs. This is where the Bristol/Embecosm Embedded Energy Benchmark Suite, or BEEBS for short, comes in.

You can follow the current state and development of BEEBS on GitHub.

A set of benchmarks for embedded devices

BEEBS was created during the summer of 2012 and was presented, with an initial set of benchmarks, in a research paper in August 2013. The original intention was, and still is, to gather a significant set of benchmarks for assessing the quality of code generated by compilers. Because we are specifically talking about embedded processors here, assessing energy consumption is just as important as assessing performance.

When gathering benchmarks, we want variety and relevance. Here is a list of sources we have looked at and borrowed from:

Standard C libraries

Standard C libraries are a reliable and easy source of benchmarks. They are reliable because each implementation is very widely used, and therefore representative of what is out there. They are easy because every platform has its own implementation of the standard C library, so the code is built to be portable.

In BEEBS we have imported parts of newlib, for example.

Data manipulation

We have collected code from libraries implementing common data structures, data manipulation, hashing and cryptography.

These algorithms make use of a lot of data, which allows us to measure the energy consumed by memory accesses. For instance, some microcontrollers can save a lot of energy by optimizing where the code is laid out and in which address space.

For example, we have borrowed from sglib, nettle, the WCET project and MiBench.

Compiler regression test suites

Because the main use case of BEEBS is to assess the quality of code generated by a compiler, we found that the regression test suites that ship with a compiler toolchain are a good source of benchmarks. For example, we have used tests from GDB. These benchmarks are simple, but each is centered on a single feature a program may have.

Running BEEBS with the energy monitoring board

With a little less than a hundred benchmarks in BEEBS at the time of writing, we have recently been working on making it easy to run all of them on a target device with an energy monitoring board. The target platform discussed here is the Shrimp, a DIY Arduino clone that fits on a breadboard. We chose it simply because we have a lot of them available.

Because each energy monitoring board can measure 3 target boards at a time, we have been able to measure 6 Shrimps with the 2 boards we have in the office. Each target board runs all the benchmarks independently of the others. We use the trigger feature of the energy monitoring board to start and stop each measurement automatically.
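To give a feel for what a trigger-bracketed measurement amounts to, here is a minimal Python sketch. It is not the API of our measurement library, and the board handles the trigger itself; the function name, the sample layout and the numbers below are all hypothetical. It assumes we already have timestamped power samples alongside the state of the trigger line, and it integrates power only while the trigger is high.

```python
from typing import Sequence


def energy_in_trigger_window(
    times_s: Sequence[float],
    power_w: Sequence[float],
    trigger: Sequence[bool],
) -> float:
    """Integrate power over the samples where the trigger line is high.

    Uses the trapezoidal rule between consecutive samples that both fall
    inside the trigger window. Returns energy in joules.
    """
    if not (len(times_s) == len(power_w) == len(trigger)):
        raise ValueError("all input sequences must have the same length")

    energy_j = 0.0
    for i in range(1, len(times_s)):
        # Only count an interval if the benchmark was running at both ends.
        if trigger[i - 1] and trigger[i]:
            dt = times_s[i] - times_s[i - 1]
            energy_j += 0.5 * (power_w[i - 1] + power_w[i]) * dt
    return energy_j


# Hypothetical trace: 1 kHz sampling, trigger high for samples 2 to 5.
times = [i / 1000.0 for i in range(8)]
power = [0.010, 0.010, 0.050, 0.052, 0.051, 0.049, 0.010, 0.010]
trig = [False, False, True, True, True, True, False, False]

energy = energy_in_trigger_window(times, power, trig)
print(f"{energy * 1e6:.1f} uJ")
```

In this picture the benchmark raises the trigger just before the code under test and lowers it just after, so start-up and idle time on the target do not pollute the result.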

A follow-up post will walk through the steps required to run BEEBS on your own hardware and generate a CSV table of the energy measurements.