Deliverable 4.1



Deliverable 4.1: Training set source

Status: Core Complete; Potential for extensions

In this deliverable, we aim to supply a large training set of data that the Machine Learning framework can use to learn the relevant features that connect program code and its energy consumption.

We decided that the best approach was first to produce a core set of applications that spans the intended embedded application space, and then to extend it with a progressively larger code base in later phases of the project.


Core source set

We decided that the core set should be a self-contained benchmark suite. To this end, we developed the BEEBS benchmark suite, which includes 10 core applications from across the embedded application space.

BEEBS has therefore been selected as our core training set source and is also released under an open source license on the MAGEEC GitHub site.

Future extension potential

Initial applications of the core source set to ML systems have indicated that the ML training will perform better if a larger number of input programs is available. Therefore, it is likely that we will want to extend the source set at a later date to improve the performance of the overall MAGEEC system.

To this end, we have identified a number of potential expansions to the code base, including:

  • GCC regression suite
  • LLVM nightly test suite
  • GNU coreutils
  • Linux build essentials
  • MiBench (non-BEEBS tests)

These will be investigated alongside the framework development.