Current Limitations and Potential Extensions of MAGEEC
This post briefly discusses some of the main contributions and limitations of the MAGEEC project and suggests several directions for future work.
The aim of the MAGEEC project is to deliver a working system for increasing the energy efficiency of embedded software. The software automatically selects compiler flags on the basis of physically profiled benchmark applications across multiple architectures and compilers. To achieve this goal within the available resources, a number of simplifying design decisions were made in both the planning and execution of the project. These decisions to strategically limit the scope of the initial work now suggest several directions for future work that could lead to even greater gains in energy efficiency. These opportunities can be broadly placed into three categories, corresponding to what can be seen as the three main contributions of the MAGEEC project:
Exploitation of energy consumption models
The first key contribution of the MAGEEC project has been the creation of an energy measurement board (the so-called “MAGEEC Wand”) which can be applied to a range of embedded architectures. Very early on, in the planning stages of the project, a deliberate decision was made to focus on physical energy measurement techniques as opposed to purely mathematical models of energy consumption. This decision was motivated by the observation that the utility of any theoretical model can only be validated by actual energy measurements, for which existing instrumentation was either extremely expensive (e.g. cycle-accurate instruction tracing tools by Lauterbach) or too inaccurate. We also decided to focus on cost-effective physical measurement techniques because, so far, theoretical energy models have only been produced for very simple architectures (such as the XMOS architecture modelled in the ENTRA project), and extending them to the more complex architectures targeted by MAGEEC would be an extremely time-consuming undertaking in itself.
Now that the basic physical instrumentation has been successfully delivered, the time has come to reconsider ways in which it can be usefully combined with energy modelling techniques. We believe this will be a two-way process in which our hardware will not only be used to test and calibrate proposed energy models, but in which the successful models will also help us work around the current limitations on the number and frequency of energy measurements the hardware supports. We therefore anticipate that instruction-level energy models will play a more prominent role in future extensions of MAGEEC.
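One way such a combination might work is to fit a simple instruction-level model to the board's measurements: treat a run's total energy as a weighted sum of instruction-class counts and recover the per-class costs by least squares. The sketch below is purely illustrative (the instruction classes, counts, and costs are all invented, and this is not the model used in MAGEEC or ENTRA):

```python
import numpy as np

# Instruction-class execution counts for four profiled benchmark runs
# (columns: ALU ops, loads/stores, branches). All numbers are invented.
counts = np.array([
    [1000.0,  200.0,  50.0],
    [ 400.0,  800.0, 120.0],
    [ 700.0,  100.0, 300.0],
    [ 250.0,  600.0,  90.0],
])

# Per-run energy as the measurement board would report it; here it is
# synthesised from assumed per-class costs (in nJ) so the fit is exact.
# Real measurements would add noise, which least squares averages out.
true_costs = np.array([1.0, 2.5, 1.8])
measured = counts @ true_costs

# Least-squares fit of the per-instruction-class energy costs.
est_costs, *_ = np.linalg.lstsq(counts, measured, rcond=None)

def predict_energy(instr_counts):
    """Predict the energy of a new program from its instruction mix."""
    return instr_counts @ est_costs
```

A model fitted this way could then estimate energy at granularities (e.g. per function) that the physical instrumentation cannot currently resolve.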
Function and data in(ter)dependence
The second key contribution of the MAGEEC project has been the creation of a set of benchmarks (the so-called Bristol/Embecosm Embedded Benchmark Suite, or BEEBS) for comparing the runtime and energy performance of programs on embedded architectures. This is vital for both the training and testing of our approach. From the outset it was decided that each benchmark should be a complete, self-contained source file (with no missing library functions or target data). This ensures that benchmarks can be reliably compared across architectures and that the performance results accurately reflect our choice of compiler flags (rather than the flags which may previously have been used to separately compile library code).
In practice, a real compiler should be able to take advantage of any available implementations of library functions which may have been hand-developed for a given architecture. In cases where effective compiler settings are already known for a function, or hand-crafted machine code already exists, a good compiler should be able to exploit these where appropriate, rather than insisting on optimising everything itself. This is likely to be an important feature in any deployment version of MAGEEC. Also important could be the ability to automatically search for different compiler settings for different functions or modules of a program. Although the current MAGEEC software is able to extract feature vectors for individual functions, we are not yet able to measure energy consumption accurately enough at the level of individual functions (this is one area in which the energy models described in the previous section may prove useful).
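As a toy illustration of what a per-function flag search might look like, the sketch below exhaustively tries every subset of two flags for each function against a table of hypothetical energy measurements (the function names, flags, and numbers are all invented; a real search would drive an actual compile-and-measure loop):

```python
import itertools

FLAGS = ["-funroll-loops", "-fomit-frame-pointer"]

# Hypothetical per-function energy measurements (e.g. in microjoules)
# for each subset of the two flags. In practice these values would come
# from physical measurement or an energy model.
energy = {
    ("fir_filter", frozenset()): 120.0,
    ("fir_filter", frozenset({"-funroll-loops"})): 95.0,
    ("fir_filter", frozenset({"-fomit-frame-pointer"})): 118.0,
    ("fir_filter", frozenset(FLAGS)): 97.0,
    ("crc32", frozenset()): 80.0,
    ("crc32", frozenset({"-funroll-loops"})): 85.0,
    ("crc32", frozenset({"-fomit-frame-pointer"})): 74.0,
    ("crc32", frozenset(FLAGS)): 78.0,
}

def best_flags(function):
    """Exhaustively search the flag subsets for one function and
    return the subset with the lowest measured energy."""
    subsets = (frozenset(c) for r in range(len(FLAGS) + 1)
               for c in itertools.combinations(FLAGS, r))
    return min(subsets, key=lambda s: energy[(function, s)])
```

Note how the best choice differs per function, which is exactly the gain a whole-program flag setting cannot capture.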
In addition, it can often be useful to consider an algorithm (such as a particular tree-sorting algorithm) independently of the data it may be applied to (such as the particular tree(s) to be sorted). Doing so offers the interesting possibility of optimising compiler flags for the best-case, average-case, or worst-case performance (in terms of energy, time, power, etc.) of a given algorithm. In other words, it is sometimes useful to recognise that two benchmarks may represent the same algorithm applied to different data structures. When testing benchmarks and measuring performance metrics, it could even be possible to automatically generate different data structures for the various algorithms under consideration.
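To make the algorithm/data distinction concrete, the sketch below runs the same insertion sort on best-case and worst-case inputs of the same size, counting element comparisons as a stand-in for a measured time or energy cost (the algorithm and metric are chosen purely for illustration):

```python
def insertion_sort_cost(data):
    """Insertion sort, returning the number of element comparisons
    as a proxy for the time/energy the algorithm would consume."""
    a = list(data)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return comparisons

n = 100
best_case = insertion_sort_cost(range(n))          # already sorted: n - 1
worst_case = insertion_sort_cost(range(n, 0, -1))  # reverse sorted: n(n-1)/2
```

The two runs are the same benchmark source compiled with the same flags, yet their costs differ by a factor of fifty; which flags win can likewise depend on which input regime one optimises for.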
Ordering of compiler passes and feature extraction
The third key contribution of the MAGEEC project has been the creation of a software platform for training and deployment of the methodology. This platform includes functionality for extracting a set of features from a target program, scripts for iteratively compiling a target program to generate test data (and storing the results in a database), code for applying machine learning to the training database to learn rules associating program features with the best known compiler configurations, and custom compiler plugins for using those rules to predict the best configuration for an unseen program. For simplicity, this workflow follows the earlier Milepost approach very closely, and therefore leaves several directions for improvement.
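In spirit, the deployment path can be pictured as a nearest-neighbour lookup: find the training program whose feature vector is closest to the unseen program's, and reuse its best known flags. The sketch below is a drastically simplified stand-in (the feature values, database entries, and flag strings are invented, and MAGEEC's actual learner is more sophisticated than one-nearest-neighbour):

```python
import math

# Toy training database: (feature vector, best known flags). The
# features stand in for Milepost-style static program features such
# as basic-block, instruction, and branch counts.
training = [
    ((12.0,  340.0,  45.0), "-O2 -funroll-loops"),
    (( 3.0,   60.0,   8.0), "-Os"),
    ((25.0, 1200.0, 160.0), "-O3"),
]

def predict_flags(features):
    """Predict a configuration for an unseen program by reusing the
    flags of its nearest neighbour in feature space."""
    nearest = min(training, key=lambda row: math.dist(row[0], features))
    return nearest[1]
```

The training side of the platform populates the database; the compiler plugin then only needs to extract features and perform this kind of lookup at build time.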
One obvious limitation of the current approach is that it learns a set of compiler flags to be used but does not investigate the order in which they should be applied. It is known that the ordering of passes can have a large influence on the performance of the resulting program (notwithstanding certain dependencies which must be respected to produce valid code). Although the MAGEEC database stores the compiler pass ordering, it would be non-trivial to change the machine learning task and collect the large amounts of data necessary to exploit this information. Instead of extracting program features just once from the compiler’s initial IR of the program, it would become necessary to extract features from subsequent IRs as the various compiler passes are applied. It might also be useful to consider additional features beyond the 41 relatively simple ones used in the Milepost project. Taking this concept to its limit, we eventually plan to consider structured machine learning approaches (such as inductive logic programming) that operate directly on the IR itself.
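A minimal sketch of the re-extraction idea: summarise the IR as a feature vector, model each pass as a transformation of those features, and greedily pick the next pass by re-evaluating a cost estimate after each candidate. Everything here (the pass effects, the cost model, and the greedy strategy) is invented for illustration and is not how a real pass manager behaves:

```python
# Toy "IR" summarised by a feature vector; each pass is modelled as a
# transformation of those features. Pass effects are invented.
def inline(f):  return {**f, "calls": f["calls"] // 2, "instrs": f["instrs"] + 20}
def unroll(f):  return {**f, "branches": f["branches"] // 2, "instrs": f["instrs"] * 2}
def dce(f):     return {**f, "instrs": int(f["instrs"] * 0.85)}

PASSES = {"inline": inline, "unroll": unroll, "dce": dce}

def cost(f):
    """Hypothetical energy proxy computed from the current features."""
    return f["instrs"] + 10 * f["branches"] + 8 * f["calls"]

def greedy_order(features, budget=3):
    """Greedily choose the next pass (each used at most once) by
    re-extracting features after each candidate application."""
    order, remaining = [], dict(PASSES)
    for _ in range(budget):
        if not remaining:
            break
        name, new_feats = min(
            ((n, p(features)) for n, p in remaining.items()),
            key=lambda item: cost(item[1]))
        if cost(new_feats) >= cost(features):
            break  # no remaining pass improves the estimate
        order.append(name)
        features = new_feats
        del remaining[name]
    return order, features
```

The point of the sketch is that the chosen ordering depends on features observed mid-pipeline, which is precisely the information a single up-front feature extraction cannot provide.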